ALGEBRAIC DECOMPOSITION METHODS FOR NONLINEAR SYSTEMS


Roger W. Brockett 

Division of Engineering and Applied Physics 
Cambridge, Massachusetts

Abstract 

Elegant algebraic theories for decomposing dynamical systems into 
elementary pieces have existed for some time in the areas of finite 
automata and linear systems. In contemporary physics, algebraic ideas,
especially Lie algebras and Lie groups, are used extensively to reveal
and explain structure. This paper is an informal survey bringing
together some of the important viewpoints found in these areas. We
find that although it is usually helpful, in many cases linearity is 
not crucial. 

1. Introduction


The main point of this paper is that the utility of the mapping 
semigroup discussed by Myhill [1] in the study of the structure of 
dynamical input-output models is by no means limited to the finite 
state, discrete time case. In many different settings it is the 
algebraic structures which one can give this set of maps which reveal 
the possibilities for decomposing the system. The type of decomposition 
one seeks will, of course, depend on the structure one wants for the 
subsystems. The standard structure theorems of algebra provide the 
tools. The class of systems we treat is not characterized by linearity
but instead by the algebraic structures which
the mapping semigroup admits.

To be sure, the general principles on which this paper is based 
are implicit in the literature. However, they do not stand out as 
clearly as they might. Perhaps the most impressive specific instance 
of the general idea we are discussing here occurs in the work of 
Krohn-Rhodes [2]. Linear system theory [3,4] itself provides a second 
example. And a third example can be extracted from the important work of 
Wei-Norman [5]. The hope is that the synthesis undertaken in an informal
way here will make these principles a little more accessible to
nonspecialists. Moreover, while it is perhaps not necessary to treat the
examples in as much detail as is done here, the hope is that this too will 
help lead to a broader understanding of the underlying principles. 

In all cases it is the decomposition of the semigroup which reveals 
the structure of the system. However, we can adopt different rules in 
effecting the decomposition and in this way get a very flexible theory
meeting a variety of needs. For example, if the mapping semigroup
can be given a group structure, then the theory of group decompositions 
can be invoked to get a decomposition of the dynamics. If the mapping 
semigroup admits a matrix algebra structure then again theories are 
available to effect the decomposition. 

The class of systems under discussion here is capable of modeling
a wide variety of phenomena lying outside the scope of conventional 
linear systems theory. By way of comparison with linear theory, we might 
explain our objective as a search for decomposition procedures which 
parallel the partial fraction expansion method. To emphasize this 
point we show by example (section 5) how partial fraction expansion 
decompositions fall out when this procedure is applied to a linear 
system. We also show how Krohn-Rhodes theory leads to a further
decomposition of system structure beyond the partial fraction expansion level.

To many people it has been clear for some time that a broader conception 
of system theory — one might say a general system theory — would be 
very desirable since technology no longer respects the classical lines 
of organizing subject material. Characteristic of this trend has been 
a merging of the continuous with the discrete and a concomitant blurring 
of the distinction between linear and nonlinear analysis. This paper 
may be viewed in this context. 

A number of algebraic terms are used in the text and examples. Some
of these are not common in the control literature and are explained in the
appendix. The others can be found in the references cited there. 



2. Automata Theory 


Many of the ideas which we want to discuss find their clearest 
and most elementary statement in the setting of finite state systems. 

In this section we want to recall a few ideas from automata theory 
which will help to put subsequent developments in perspective. 

Suppose we have finite sets U and X together with an evolution 
equation 

\[
x(k+1) = \lambda(x(k), u(k)) ; \quad u(k) \in U ; \quad x(k) \in X
\]

We call such an object a finite state system. An important concept 
in the theory of finite state systems is that of the semigroup of 
the system. This might be explained as follows. 

If X has n elements then the total number of maps of X into itself
is $n^n$. Denote this set of maps by F(X,X). Now the subset of F(X,X)
consisting of

\[
S = \bigcup_{n \geq 0} \; \bigcup_{u_i \in U} \lambda(\lambda(\lambda \cdots \lambda(\lambda(\cdot, u_1), u_2) \cdots, u_{n-2}), u_{n-1}), u_n) \tag{2.1}
\]

can be given a semigroup structure by introducing a multiplication
which is just composition of maps. We use $\circ$ to denote multiplication and
denote this semigroup by $\mathcal{S} = (S, \circ)$. It is often called the Myhill
semigroup. It has only a finite number of elements because F(X,X) is
finite.
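
The construction in (2.1) can be sketched computationally. The following is only an illustrative sketch, not part of the original development: the function name `myhill_semigroup` and the toy system (a modulo 3 counter with an advance input "a" and a reset input "b") are hypothetical choices. Each map of X into itself is stored as a tuple of images, and the set of one-step maps is closed under composition.

```python
def myhill_semigroup(states, inputs, transition):
    """Close the one-step maps transition(., u) under composition.

    Each map X -> X is stored as a tuple of images, indexed by the
    position of the state in `states`.
    """
    states = list(states)
    position = {x: i for i, x in enumerate(states)}
    generators = {
        tuple(position[transition(x, u)] for x in states) for u in inputs
    }
    elements = set(generators)
    frontier = set(generators)
    while frontier:
        # Left-multiply every newly found map by every generator.
        products = {
            tuple(g[s[i]] for i in range(len(states)))
            for g in generators
            for s in frontier
        }
        frontier = products - elements
        elements |= frontier
    return elements


def toy_transition(x, u):
    """Hypothetical example: 'a' advances a mod-3 counter, 'b' resets it."""
    return (x + 1) % 3 if u == "a" else 0


semigroup = myhill_semigroup(range(3), "ab", toy_transition)
print(len(semigroup), "of", 3 ** 3, "possible maps")
```

For this toy system the closure contains the three cyclic shifts and the three constant maps, so only a small part of F(X,X) is reached.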

There is a second semigroup of interest here and that is the free
semigroup over U which consists of all finite strings of elements
$u_1 u_2 \cdots u_p$ with the multiplication operation being concatenation.
We denote this semigroup by U*. Each element in U* gives rise to exactly
one element of S according to the rule

\[
\lambda^* : u_1 u_2 \cdots u_p \mapsto \lambda(\lambda(\cdots \lambda(\cdot, u_1), u_2) \cdots, u_p)
\]

It is immediate that the diagram below is commutative with this definition
of $\lambda^*$. That is to say, $\lambda^*$ is a homomorphism of U* into $\mathcal{S}$.


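\[
\begin{array}{ccc}
U^* \times U^* & \xrightarrow{\ \text{concatenate}\ } & U^* \\
\downarrow {\scriptstyle \lambda^* \times \lambda^*} & & \downarrow {\scriptstyle \lambda^*} \\
\mathcal{S} \times \mathcal{S} & \xrightarrow{\ \text{composition}\ } & \mathcal{S}
\end{array}
\]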


Since $\lambda^*$ is onto $\mathcal{S}$ we may say that $\mathcal{S}$ is the homomorphic image of
the semigroup U*.

In semigroups a homomorphism defines a congruence which can be
"divided out" to get a simpler semigroup. This point of view gives
rise to an alternative characterization of the homomorphism $\lambda^*$. If
$u_1 u_2 \cdots u_q$ is a string which takes all states back to themselves
after q steps then the homomorphism $\lambda^*$ takes this sequence into the
identity of $\mathcal{S}$. Moreover no other strings are taken into the identity
of $\mathcal{S}$ so that the kernel of this homomorphism is the set of sequences
which give rise to closed paths in the state space for each initial
state.* In this sense

\[
\mathcal{S} \cong \text{sequences} / (\text{sequences giving closed paths})
\]

* This statement, with its topological implications, was pointed out
to me by Prof. D.L. Elliot of Washington University.

It is exactly the insertion of the semigroup $\mathcal{S}$ into the theory
of finite state systems which makes it possible to study decomposition
theory using algebraic methods. In fact the introduction of algebraic
machinery comes about in a very natural way after one more step. Observe
that we may associate with each element $u_i$ of U a map $\lambda(\cdot, u_i)$. If
$s(\cdot)$ belongs to $\mathcal{S}$ then the difference equation

\[
s(k+1) = [\lambda(\cdot, u(k))] \circ s(k) \tag{2.2}
\]

evolves in the semigroup $\mathcal{S}$. The solution of this equation is "fundamental"
in a sense similar to the use of "fundamental solution" in linear theory.

That is, if $s(\cdot)$ is the solution corresponding to an initial state
which is the identity element of $\mathcal{S}$ and an input string $u_1 u_2 \cdots$,
then the solution at time i of the equation

\[
x(k+1) = \lambda(x(k), u(k)) ; \quad x(0) = x_0 ; \quad u(\cdot) = u_1 u_2 \cdots
\]

is the image of $x_0$ under the map s(i) viewed as an element of F(X,X).

We call the equation for s the semigroup equation or the Myhill
equation. It is important to emphasize that the solution of the
semigroup equation evolves in a very simple way, regardless of the
complexities of $\lambda$. If one knows enough about the structure of finite
semigroups the decomposition of this equation into simpler pieces can 
be carried out. This step has been carried out by Krohn and Rhodes 
in their important study [2]. In the special case where $\mathcal{S}$ is actually
a group the Krohn-Rhodes results on decomposition are not difficult to
explain. The idea is that either the group is simple, in which case they
show that in a certain sense the system is irreducible, or else it is not,
in which case the normal subgroups can be divided out to get a decomposed 
system. We give an example in the next section. 

In the remainder of the paper we investigate to what extent we can 
carry over these ideas to infinite state discrete and continuous time systems. 

3. An Example of a Finite Group Decomposition

The examples in this paper progress from the easy to the
difficult. Our first example, illustrating the Krohn-Rhodes
theory, is interesting because it shows that from the point of view 
of automata theory a scalar first order difference equation (over a 
finite field) can sometimes be further decomposed. 

Consider the system 

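\[
x(k+1) = \alpha x(k) + \beta u(k) ; \quad y(k) = x(k)
\]

where x(k) and u(k) take on the values 0, 1, 2, and $\alpha$ and $\beta$ are constants
which take on one of these values and arithmetic is done modulo 3.
The total number of maps of the state space into itself is 27; the
semigroup itself consists of a subset of the following (observe that
$\alpha^3$ equals $\alpha$)

\[
\begin{array}{ll}
g_1(\cdot) = \alpha(\cdot) & g_7(\cdot) = \alpha^2(\cdot) + \beta \\
g_2(\cdot) = \alpha(\cdot) + \beta & g_8(\cdot) = \alpha^2(\cdot) + \alpha\beta + \beta \\
g_3(\cdot) = \alpha(\cdot) + 2\beta & g_9(\cdot) = \alpha^2(\cdot) + 2\alpha\beta + \beta \\
g_4(\cdot) = \alpha^2(\cdot) & g_{10}(\cdot) = \alpha^2(\cdot) + 2\beta \\
g_5(\cdot) = \alpha^2(\cdot) + \alpha\beta & g_{11}(\cdot) = \alpha^2(\cdot) + \alpha\beta + 2\beta \\
g_6(\cdot) = \alpha^2(\cdot) + 2\alpha\beta & g_{12}(\cdot) = \alpha^2(\cdot) + 2\alpha\beta + 2\beta
\end{array}
\]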


For example, if $\alpha = 2$ and $\beta = 1$ then there are 6 maps which are
distinct. Let's take these as $g_1$, $g_2$, $g_3$, $g_4$, $g_5$, and $g_6$. A short
calculation reveals that this group is isomorphic to the dihedral
group* $D_3$. We can take two of these maps, one of order three and one
of order two, to be the generators. Since $D_3$
is not simple we can decompose this semigroup and the resulting system.

* The dihedral group $D_n$ is a group of order 2n consisting of all possible
products of two generators x and y subject to the relations $x^n = 1$,
$y^2 = 1$ and $y x y = x^{-1}$.

By letting $z(k) = 2^k x(k)$, we can write the evolution equation in
terms of modulo 3 arithmetic as

\[
z(k+1) = z(k) + \alpha \cdot w(k) u(k) ; \quad y(k) = w(k) z(k)
\]
\[
w(k+1) = 2 \cdot w(k)
\]

The semigroup of the second of these is isomorphic to $Z_2$* whereas
the semigroup of the first (regarding w(k)u(k) as the input) is isomorphic
to $Z_3$. The appropriate block diagrams are shown below.



Figure 1 : Linear Sequential Machine Representation of a 
Modulo 3 System. 



Figure 2 : Decomposed Version of the Modulo 3 System of Figure 1. 


* $Z_p$ denotes the group of integers {0, 1, 2, ..., p-1} with addition modulo
p being the group operation.
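
The claims of this section can be checked by brute force. The sketch below is an illustration only; the helper names are hypothetical, and the initial condition w(0) = 1 is an assumption (it makes z(0) = x(0), consistent with reading w(k) as $2^k$ modulo 3). It generates the semigroup for $\alpha = 2$, $\beta = 1$ and compares the outputs of the original and decomposed systems over all input strings of length four.

```python
from itertools import product

P = 3                # arithmetic is done modulo 3
ALPHA, BETA = 2, 1   # the values used in the example


def step_map(u):
    """One-step map x -> ALPHA*x + BETA*u (mod 3) as a tuple of images."""
    return tuple((ALPHA * x + BETA * u) % P for x in range(P))


def compose(f, g):
    """Return f o g (apply g first, then f)."""
    return tuple(f[g[x]] for x in range(P))


def generated_semigroup(generators):
    """Close a collection of maps under composition."""
    elements, frontier = set(generators), set(generators)
    while frontier:
        frontier = {compose(g, s) for g in generators for s in frontier} - elements
        elements |= frontier
    return elements


def outputs_original(x0, inputs):
    x, ys = x0, []
    for u in inputs:
        ys.append(x)                      # y(k) = x(k)
        x = (ALPHA * x + BETA * u) % P
    return ys


def outputs_decomposed(x0, inputs):
    z, w, ys = x0, 1, []                  # w(0) = 1 is an assumption
    for u in inputs:
        ys.append((w * z) % P)            # y(k) = w(k) z(k)
        z = (z + ALPHA * w * u) % P       # driven by the product w(k) u(k)
        w = (ALPHA * w) % P               # autonomous part, period two
    return ys


semigroup = generated_semigroup([step_map(u) for u in range(P)])
identity = tuple(range(P))
is_group = identity in semigroup and all(
    any(compose(f, g) == identity for g in semigroup) for f in semigroup
)
is_abelian = all(
    compose(f, g) == compose(g, f) for f in semigroup for g in semigroup
)
agree = all(
    outputs_original(x0, us) == outputs_decomposed(x0, us)
    for x0 in range(P)
    for us in product(range(P), repeat=4)
)
print(len(semigroup), is_group, is_abelian, agree)
```

A nonabelian group of order six is necessarily isomorphic to $D_3$, so the first three printed values are consistent with the statement in the text.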

4. Bilinear Discrete Time Systems

Even if we abandon the assumptions that U and X be finite sets 
it is still possible to utilize the previous definitions for SP and 
the semigroup equation itself. Typically SP will not be finite although 
there certainly are interesting cases for which it is and in these cases 
the Krohn-Rhodes theory will apply. The structure of infinite semigroups 
on the other hand is not well understood and thus to make further progress
it is natural to look at systems for which the semigroup admits additional
structure. In this section we investigate a class of systems for which 
it can be given the structure of a matrix algebra. 

A significant extension of the linear discrete time system is
the class of systems which evolve in a real vector space $\mathbb{R}^n$ according
to the rule

\[
x(k+1) = \Big( A_0 + \sum_{i=1}^{\nu} u_i(k) A_i \Big) x(k) + \sum_{i=1}^{\nu} b_i u_i(k) \tag{4.1}
\]

Here we have a linear dependence on the initial state but a nonlinear
dependence on the input. What is the semigroup in this case? Since
we have at each step x(k+1) = M(u)x(k) + n(u) it is clear that the set
of all maps of the state space into itself is the composition of such
maps. However, the composition of two such maps is a third map of the
same form. After a calculation one can see that the semigroup for equation
(4.1) consists of maps of the form

\[
S = \prod_{\ell=0}^{p-1} \Big[ A_0 + \sum_{i=1}^{\nu} u_i(\ell) A_i \Big] x + \sum_{j=0}^{p-1} \; \prod_{\ell=j+1}^{p-1} \Big[ A_0 + \sum_{i=1}^{\nu} u_i(\ell) A_i \Big] \sum_{i=1}^{\nu} b_i u_i(j) \tag{4.2}
\]

Recall that a map of $\mathbb{R}^n$ into $\mathbb{R}^n$ is called affine if it is of the form of
a translation plus a nonsingular linear transformation. This set of maps
would be affine if the linear transformation part were invertible. There
is, however, no need to require invertibility at this point. We call
maps of the form Mx+b, with M not necessarily invertible, pseudo-affine.
Notice that the semigroup defines an equivalence relation on the input
space whereby $u_1 \sim u_2$ if they both give rise to the same map.

It is easy to see that it is possible to put the set of pseudo-affine
maps in one to one correspondence with a set of n+1 by n+1
matrices according to the rule

\[
\begin{bmatrix} G & b \\ 0 & 1 \end{bmatrix} \sim g \quad \text{with} \quad g(x) = Gx + b
\]

The set of pseudo-affine maps on $\mathbb{R}^n$ is, of course, a semigroup under
composition. The correspondence defined above is a semigroup
homomorphism if we regard the set of matrices as a multiplicative semigroup.
This hinges on the two calculations which give the effect of semigroup
multiplication in the respective cases

\[
\text{i)} \quad \begin{bmatrix} G_1 & b_1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} G_2 & b_2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} G_1 G_2 & G_1 b_2 + b_1 \\ 0 & 1 \end{bmatrix}
\]
\[
\text{ii)} \quad G_1 (G_2 x + b_2) + b_1 = G_1 G_2 x + G_1 b_2 + b_1
\]

We denote the matrix semigroup by

\[
\mathcal{A}(n) = \Big\{ G : G = \begin{bmatrix} M & b \\ 0 & 1 \end{bmatrix} \Big\}
\]
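
The two calculations i) and ii) can be compared on a concrete pair of maps. The following sketch is illustrative only; the helper names and the sample matrices are hypothetical. The first map has a singular linear part, to emphasize that invertibility plays no role in the correspondence.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def augment(G, b):
    """Build the (n+1) by (n+1) matrix [[G, b], [0, 1]]."""
    n = len(G)
    return [G[i] + [b[i]] for i in range(n)] + [[0] * n + [1]]


def compose_affine(f, g):
    """Compose x -> G1 x + b1 after x -> G2 x + b2, as in calculation ii)."""
    (G1, b1), (G2, b2) = f, g
    G1b2 = [sum(G1[i][k] * b2[k] for k in range(len(b2))) for i in range(len(b1))]
    return matmul(G1, G2), [G1b2[i] + b1[i] for i in range(len(b1))]


f = ([[1, 2], [2, 4]], [1, -1])   # singular linear part: pseudo-affine only
g = ([[0, 1], [3, 0]], [5, 2])

via_composition = augment(*compose_affine(f, g))
via_matrices = matmul(augment(*f), augment(*g))
print(via_composition == via_matrices)
```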

Having a convenient representation for the semigroup associated with
equation (4.1), the next step is to display the semigroup equation itself.
A little thought will verify that the semigroup (4.2) evolves according
to the equation

\[
S(k+1) = \left( \begin{bmatrix} A_0 & 0 \\ 0 & 1 \end{bmatrix} + \sum_{i=1}^{\nu} u_i(k) \begin{bmatrix} A_i & b_i \\ 0 & 0 \end{bmatrix} \right) S(k) \tag{4.3}
\]


By a matrix algebra (over a fixed field) we mean a set of square
matrices which is a vector space with respect to matrix addition and 
scalar multiplication and which is closed under matrix multiplication. 

Since a lot is known about the structure of matrix algebras 
including the extent to which they can be decomposed, the question 
naturally arises as to whether or not these results can be brought 
to bear. Clearly the semigroup is closed under multiplication; after 
all this is the semigroup property. Troubles arise with regard to 
the vector space structure. Even in the special case where the evolution 
equation is 

\[
x(k+1) = \Big[ \sum_{i=1}^{\nu} u_i(k) A_i \Big] x(k)
\]

and the semigroup equation is

\[
S(k+1) = \Big[ \sum_{i=1}^{\nu} u_i(k) A_i \Big] S(k)
\]

the semigroup is in general not closed under matrix addition. 

Confronted with this situation a natural thing to do is compute 
the semigroup and find the smallest matrix algebra which contains it. 

In fact this seemingly ad hoc solution can be justified further by 
noticing that if we want to obtain bilinear subsystems this is an appropriate 
structure. In a complete theory this point will require careful attention. 
Decomposing this algebra will, of course, decompose the actual semigroup 
although this procedure overlooks the possibility that the semigroup might 
admit a decomposition not shared by the smallest matrix algebra which contains it. 
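
What then is the smallest matrix algebra $\mathcal{M}$ containing the
set of matrices

\[
\mathcal{G} = \bigcup_{u_i \in \mathbb{R}} \; \bigcup_{n > 0} \; \prod_{k=0}^{n} \begin{bmatrix} A_0 + \sum_i u_i(k) A_i & \sum_i u_i(k) b_i \\ 0 & 1 \end{bmatrix}
\]

One can't be more explicit than to display it as

\[
\mathcal{M} = \Big\{ M : M = \sum_i \alpha_i s_i ; \; s_i \in \mathcal{G} \Big\}
\]

except in special cases. For example if $A_0$ is n by n and if we have

\[
\mathcal{G} = \bigcup_{u \in \mathbb{R}} \; \bigcup_{n \geq 0} \; \prod_{k=0}^{n} \begin{bmatrix} A_0 & u(k) b \\ 0 & 1 \end{bmatrix}
\]

then

\[
\mathcal{M} = \left\{ M : M = \begin{bmatrix} \alpha(A_0) & x \\ 0 & \alpha(1) \end{bmatrix} ; \; \alpha = \text{polynomial of degree } n ; \; x \in \text{Range}\,(b, A_0 b, \ldots, A_0^{n-1} b) \right\}
\]

as is easily verified by use of the Cayley-Hamilton theorem.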

By bringing standard algebraic decomposition theorems to bear on
this problem we can decompose the semigroup and hence obtain a realization
of the original system which is decomposed. To make this important point
clear, suppose that we can decompose the enlarged semigroup as a direct
sum of say n parts, $M_1 \oplus M_2 \oplus \cdots \oplus M_n$. Then we can write the
semigroup equation as

\[
\begin{aligned}
M_1(k+1) &= [A_0^1 + \textstyle\sum_i u_i(k) A_i^1] \, M_1(k) \\
M_2(k+1) &= [A_0^2 + \textstyle\sum_i u_i(k) A_i^2] \, M_2(k) \\
&\;\;\vdots \qquad \text{(superscripts are not powers)} \\
M_n(k+1) &= [A_0^n + \textstyle\sum_i u_i(k) A_i^n] \, M_n(k)
\end{aligned}
\]

with $S(k) = \sum_i M_i(k)$. Since $x(k) = S(k) x_0$ this set of systems obviously
simulates the original system but is decomposed in the sense of having
semigroups which are subsets of simple matrix algebras.

5. An Example of a Matrix Algebra Decomposition

Our objective here is to show what this philosophy yields when 
we apply it to a standard situation. 

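Consider a linear system

\[
x(k+1) = A x(k) + b u(k) ; \quad x(k) \in \mathbb{R}^n ; \quad u \in \mathbb{R}^1
\]

As we have seen the Myhill equation can be expressed as

\[
S(k+1) = \begin{bmatrix} A & u(k) b \\ 0 & 1 \end{bmatrix} S(k)
\]

The set of matrices

\[
\mathcal{G} = \bigcup_{n > 0} \; \prod_{k=0}^{n} \begin{bmatrix} A & u(k) b \\ 0 & 1 \end{bmatrix}
\]

does not form a matrix algebra since it is not closed under addition.
However if we enlarge it by inserting a v(k) to get

\[
\mathcal{R} = \bigcup_{n \geq 0} \; \prod_{k=0}^{n} \begin{bmatrix} v(k) A & u(k) b \\ 0 & v(k) \end{bmatrix}
\]

then we do get a matrix algebra. More concretely, $\mathcal{R}$ consists of
matrices of the form

\[
\begin{bmatrix} p(A) & x \\ 0 & p(1) \end{bmatrix}
\]

where p is any polynomial of degree n or less and x is any vector in the
range space of $(b, Ab, \ldots, A^{\nu} b)$ with $\nu$ the degree of p.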

We can decompose this matrix algebra to get a decomposition of the original 
system. This works in the following way. Notice that if A has a diagonal 




Jordan normal form then by the transformation

\[
\begin{bmatrix} P & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} A & b \\ 0 & 1 \end{bmatrix} \begin{bmatrix} P^{-1} & 0 \\ 0 & 1 \end{bmatrix}
\]

we can bring A into diagonal form. Thus we have a matrix algebra whose
elements are of the form 



where $x = (x_1, x_2, \ldots, x_r)'$ is a vector in the reachable set for the
transformed system. Since the matrices of the form



form a one sided ideal, ($\mathcal{R}_i(p,x) \cdot \mathcal{R}_j(p,x) \subset \mathcal{R}_i(p,x)$), it is easily verified that

\[
\mathcal{R} = \mathcal{R}_1 + \mathcal{R}_2 + \cdots + \mathcal{R}_n \;\; {}^*
\]

where + indicates a semidirect decomposition in the sense of matrix
algebras. We leave the details of the repeated root case to the reader.

* That is to say the $\mathcal{R}_i$ are ideals which as vector spaces taken all
together span $\mathcal{R}$. However the vector spaces $\mathcal{R}_i$ are not necessarily
orthogonal as they would be in a direct sum decomposition.

6. Bilinear Continuous Time Systems

Carrying these ideas over to the case of ordinary differential 
equations is not as difficult as one might suppose. The assumptions 
we use to insure that the semigroup will have a manageable form are 
very similar to those used in section 4. Instead of matrix algebras, 
matrix Lie algebras are the key to understanding the structure. 

We consider systems of the form

\[
\dot{x}(t) = \Big[ A_0 + \sum_{i=1}^{\nu} u_i(t) A_i \Big] x(t) + \sum_{i=1}^{\nu} b_i u_i(t) \tag{6.1}
\]

Notice that the input-output maps of such systems are decidedly
nonlinear and this class is not as special as it might look at first
sight. Moreover, this class of models fills an important gap in the
currently available theory because it allows one to model systems
for which the Euclidean norm $\|x\| = (\sum x_i^2)^{1/2}$ is preserved and also
allows one to model systems for which the norm $\|x\| = \sum |x_i|$ is
preserved. The former condition has significant application in
systems where energy is conserved and the latter is important in modeling
continuous time jump processes where the sum of the probabilities is
necessarily one. Systems in which either constraint is an important
aspect obviously cannot be modeled as

\[
\dot{x}(t) = Ax + bu(t)
\]

with the system being controllable. Rink and Mohler [6] and the author [7]
discuss further applications of this model.

A good deal is known about the controllability of equation (6.1) as
the result of Lie algebraic techniques [7-10]. It follows from the variation
of constants formula that the set of maps of the state space into
itself are all of the form $x \mapsto Mx + b$. The exact set of M's which can
appear here is the set of possible transition matrices
and the set of b's depends on the reachable set. Of course if we
augment x by adding an additional component which is always one, then
we have

\[
\frac{d}{dt} \begin{bmatrix} x(t) \\ 1 \end{bmatrix} = \begin{bmatrix} A_0 + \sum_i u_i(t) A_i & \sum_i u_i(t) b_i \\ 0 & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ 1 \end{bmatrix}
\]

This device allows us to think of (6.1) as being a special case of

\[
\dot{x}(t) = \Big[ A_0 + \sum_{i=1}^{\nu} u_i(t) A_i \Big] x(t) \tag{6.2}
\]

It is clear that the analog of the Myhill equation appropriate
for equation (6.2) is the matrix equation

\[
\dot{S}(t) = \Big[ A_0 + \sum_{i=1}^{\nu} u_i(t) A_i \Big] S(t) \tag{6.3}
\]


The possibilities for decomposing this equation are implicit in the
very interesting work of Wei and Norman [5] on the solution of time
varying linear differential equations. What Wei and Norman show is
that the smallest vector space of matrices which is closed under the
operation of commutation, [A,B] = AB-BA, plays a decisive role. This
space is called a Lie algebra and it plays an important role here and in
related work [7-10].

The relationship between the commutator and structure of the
solution of linear differential equations may be explained as follows.
First of all it is known (see e.g. Wichmann [11]) that if for each i, $A_i$
is a piecewise continuous function of time for $-\infty < t < \infty$ and if

\[
\dot{x}(t) = \Big[ \sum_{i=1}^{\nu} A_i(t) \Big] x(t)
\]

then the transition matrices $\Phi_{A_i}$ of $\dot{x}(t) = A_i(t) x(t)$ are related
to the transition matrix of the total system via

\[
\Phi_{\Sigma A_i} = \Phi_{A_1} \Phi_{A_2} \cdots \Phi_{A_\nu}
\]

with the individual factors on the right commuting, provided that for
all i and j $[A_i, A_j] = 0$.

The proof of this is easy in the case $\nu = 2$ and the general result
follows by an induction.
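
For constant matrices the transition matrix over a unit time interval is simply the matrix exponential, so the product formula can be checked numerically. The sketch below is illustrative only: the sample matrices are hypothetical, and the exponential is computed by a truncated power series, which is adequate for matrices of this size but is not a general-purpose method.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]


def expm(a, terms=60):
    """Matrix exponential by truncated power series (small matrices only)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, a)]
        result = add(result, term)
    return result


def max_abs_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))


# A commuting pair: [A1, A2] = 0.
A1 = [[1.0, 1.0], [0.0, 1.0]]
A2 = [[2.0, 3.0], [0.0, 2.0]]
# A pair which does not commute.
B1 = [[0.0, 1.0], [0.0, 0.0]]
B2 = [[0.0, 0.0], [1.0, 0.0]]

commuting_gap = max_abs_diff(expm(add(A1, A2)), matmul(expm(A1), expm(A2)))
noncommuting_gap = max_abs_diff(expm(add(B1, B2)), matmul(expm(B1), expm(B2)))
print(commuting_gap, noncommuting_gap)
```

The first gap is at the level of rounding error while the second is not, which is the sense in which the commutation condition is needed.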

Secondly, it is known (see Wichmann [11] or Wei-Norman [5]) that
if the Lie algebra generated by a set of constant matrices $\{A_i\}$ is
solvable* then the solution of the differential equation

\[
\dot{x}(t) = [g_1(t) A_1 + \cdots + g_\nu(t) A_\nu] x(t)
\]

can be expressed explicitly in terms of integrals.

* See appendix for a definition.

The preceding remarks lead to the conclusion that the basic solution structure
stands revealed in the decomposed version of the Lie algebra generated
by the $A_i$. If this is a semi-simple algebra then

\[
\mathcal{L} = \mathcal{L}_1 \oplus \mathcal{L}_2 \oplus \mathcal{L}_3 \oplus \cdots \oplus \mathcal{L}_r
\]

where the $\mathcal{L}_i$ are simple subalgebras, and the previous analysis shows
that the transition matrix is

\[
\Phi = X_1 X_2 \cdots X_r
\]

where the factors belong to the Lie groups corresponding to the
simple Lie algebras $\mathcal{L}_i$. If the algebra has a radical in addition
to the semisimple part then provided that one can compute the solution
for the simple subalgebras one can arrive at an equation involving the
radical which can be solved explicitly. In order to actually solve
the equation when the subalgebras are not solvable Wei and Norman
suggest looking for a solution of the form

\[
X(t) = e^{g_1(t) H_1} e^{g_2(t) H_2} \cdots e^{g_r(t) H_r}
\]


What their method rests on is the demonstration of the following fact.
Let $\{H_i\}$ be a basis for $\mathcal{L}$. Then

\[
\Big[ \prod_{j=1}^{r} e^{g_j H_j} \Big] H_i \Big[ \prod_{j=r}^{1} e^{-g_j H_j} \Big] = \sum_{k=1}^{n} \xi_{ki} H_k ; \quad r = 1, \ldots, n
\]

where each of the $\xi_{ki}$ is an analytic function of the $g_j$. Having
this at their disposal it is easy to verify that at least for small
|t| one can find a solution in the given form simply by equating
the coefficients of $H_i$ on each side of the equation

\[
\frac{d}{dt} \, e^{g_1 H_1} e^{g_2 H_2} \cdots e^{g_r H_r} = A \, e^{g_1 H_1} e^{g_2 H_2} \cdots e^{g_r H_r}
\]

7. An Example of a Lie Group Decomposition

Consider the electrical network shown in figure 3. This model
illustrates some of the features of a voltage conversion network.
The equations of motion are (u = 0 corresponds to left switch open
and right switch closed, u = 1 corresponds to left switch closed,
right switch open)


C l*l 

C 2*2 

Li, 


-(l-u)I 3 + I 
u(E-V 1 )+(1-u)V 2 


Now if we make the replacements x ^ 5 * 2 " V 2 and x 3 ” 3 

and let a * 1//LC 2 , & * 1/A.C^ y ■ E/vU, 6- 1/*^ 



Figure 3 : An electrical network controlled by switches

then we obtain


We now introduce the affine representation and write 



The smallest Lie algebra which contains these two matrices is a 
6 dimensional algebra whose typical element is 



This Lie algebra contains as a three dimensional ideal the subalgebra 
whose typical element is 



Thus we can decompose the Lie algebra as 

sp + m 1 +& 1 +.^> l 

where ^ indicates the one dimensional Lie algebra and + indicates a 
semidirect product. Let be the solution of the equation 
and let « (0,^,0)' and " (0,0, Y) 1 . Then the block diagram of
the decomposed system is shown in figure 4. 



Figure 4 : Showing the decomposed version of systems in block 
diagram form 

Perhaps it is of some interest to carry this analysis a little bit 
further to give a more complete picture of the Wei-Norman method. To 
do this we pick a basis for $\mathcal{L}$ and proceed as follows.
Let $\Omega_1$, $\Omega_2$, $\Omega_3$ be given by



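Clearly these generate a Lie algebra which is not solvable. We note that
a direct power series expansion together with the identities
$[\Omega_1, \Omega_2] = \Omega_3$, $[\Omega_2, \Omega_3] = \Omega_1$, $[\Omega_3, \Omega_1] = \Omega_2$ gives

\[
\begin{aligned}
e^{a\Omega_1} \Omega_2 e^{-a\Omega_1} &= \Omega_2 + a[\Omega_1, \Omega_2] + \tfrac{1}{2!} a^2 [\Omega_1, [\Omega_1, \Omega_2]] + \cdots \\
&= \Omega_2 + a\Omega_3 - \tfrac{1}{2!} a^2 \Omega_2 - \tfrac{1}{3!} a^3 \Omega_3 + \cdots \\
&= \Omega_2 \cos a + \Omega_3 \sin a
\end{aligned}
\]

and also

\[
\begin{aligned}
e^{a\Omega_2} \Omega_3 e^{-a\Omega_2} &= \Omega_3 \cos a + \Omega_1 \sin a \\
e^{a\Omega_3} \Omega_1 e^{-a\Omega_3} &= \Omega_1 \cos a + \Omega_2 \sin a
\end{aligned}
\]

On the other hand,

\[
\begin{aligned}
e^{a\Omega_3} \Omega_2 e^{-a\Omega_3} &= \Omega_2 \cos a - \Omega_1 \sin a \\
e^{a\Omega_1} \Omega_3 e^{-a\Omega_1} &= \Omega_3 \cos a - \Omega_2 \sin a \\
e^{a\Omega_2} \Omega_1 e^{-a\Omega_2} &= \Omega_1 \cos a - \Omega_3 \sin a
\end{aligned}
\]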
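
Now if we try for a solution of

\[
\dot{X} = (u_1(t) \Omega_1 + u_2(t) \Omega_2) X ; \quad X(0) = I
\]

in the Wei-Norman form we assume $X = e^{g_1 \Omega_1} e^{g_2 \Omega_2} e^{g_3 \Omega_3}$; we have

\[
\dot{X} = \dot{g}_1 \Omega_1 e^{g_1 \Omega_1} e^{g_2 \Omega_2} e^{g_3 \Omega_3} + e^{g_1 \Omega_1} \dot{g}_2 \Omega_2 e^{g_2 \Omega_2} e^{g_3 \Omega_3} + e^{g_1 \Omega_1} e^{g_2 \Omega_2} \dot{g}_3 \Omega_3 e^{g_3 \Omega_3}
\]

Now use the above to get

\[
e^{g_1 \Omega_1} \dot{g}_2 \Omega_2 e^{g_2 \Omega_2} e^{g_3 \Omega_3} = \dot{g}_2 e^{g_1 \Omega_1} \Omega_2 e^{-g_1 \Omega_1} X = \dot{g}_2 (\Omega_2 \cos g_1 + \Omega_3 \sin g_1) X
\]

Use this idea twice to get

\[
\begin{aligned}
e^{g_1 \Omega_1} e^{g_2 \Omega_2} \dot{g}_3 \Omega_3 e^{g_3 \Omega_3} &= \dot{g}_3 e^{g_1 \Omega_1} (\Omega_3 \cos g_2 + \Omega_1 \sin g_2) e^{g_2 \Omega_2} e^{g_3 \Omega_3} \\
&= \dot{g}_3 [\Omega_1 \sin g_2 + \cos g_2 (\Omega_3 \cos g_1 - \Omega_2 \sin g_1)] X
\end{aligned}
\]

so the Wei-Norman equations are in matrix form

\[
\dot{g}_1 \Omega_1 + \dot{g}_2 (\Omega_2 \cos g_1 + \Omega_3 \sin g_1) + \dot{g}_3 (\Omega_1 \sin g_2 + \Omega_3 \cos g_1 \cos g_2 - \Omega_2 \cos g_2 \sin g_1) = u_1 \Omega_1 + u_2 \Omega_2
\]

Decomposed these become

\[
\begin{aligned}
\dot{g}_1 + \dot{g}_3 \sin g_2 &= u_1 \\
\dot{g}_2 \cos g_1 - \dot{g}_3 \cos g_2 \sin g_1 &= u_2 \\
\dot{g}_2 \sin g_1 + \dot{g}_3 \cos g_1 \cos g_2 &= 0
\end{aligned}
\]

and finally

\[
\begin{bmatrix} 1 & 0 & \sin g_2 \\ 0 & \cos g_1 & -\cos g_2 \sin g_1 \\ 0 & \sin g_1 & \cos g_1 \cos g_2 \end{bmatrix} \begin{bmatrix} \dot{g}_1 \\ \dot{g}_2 \\ \dot{g}_3 \end{bmatrix} = \begin{bmatrix} u_1 \\ u_2 \\ 0 \end{bmatrix}
\]

Notice that

\[
\det \begin{bmatrix} \cos g_1 & -\cos g_2 \sin g_1 \\ \sin g_1 & \cos g_1 \cos g_2 \end{bmatrix} = \cos^2 g_1 \cos g_2 + \sin^2 g_1 \cos g_2 = \cos g_2
\]

This set of equations therefore is not meaningful at $g_2 = \pm\pi/2$.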
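
The Wei-Norman equations above can be tested numerically. The following is a sketch under stated assumptions, not the construction used in the text: the three skew-symmetric matrices are an assumed concrete basis chosen only because they satisfy the commutation relations $[\Omega_1, \Omega_2] = \Omega_3$, $[\Omega_2, \Omega_3] = \Omega_1$, $[\Omega_3, \Omega_1] = \Omega_2$, and the inputs are taken constant so that the exact solution is a single matrix exponential. The equations are solved for the derivatives, integrated by a Runge-Kutta scheme, and the resulting product of exponentials is compared with the direct solution.

```python
import math

# Assumed basis (hypothetical choice): it satisfies
# [O1, O2] = O3, [O2, O3] = O1, [O3, O1] = O2.
O1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
O2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
O3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]


def scale(a, c):
    return [[c * v for v in row] for row in a]


def expm(a, terms=40):
    """Matrix exponential by truncated power series (fine for small norms)."""
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(matmul(term, a), 1.0 / k)
        result = add(result, term)
    return result


def g_dot(g, u1, u2):
    """Wei-Norman equations solved for the derivatives (needs cos g2 != 0)."""
    g1, g2, _ = g
    return (u1 + u2 * math.sin(g1) * math.tan(g2),
            u2 * math.cos(g1),
            -u2 * math.sin(g1) / math.cos(g2))


def integrate(u1, u2, t_final, steps):
    """Classical fourth-order Runge-Kutta from g(0) = 0, constant inputs."""
    h = t_final / steps
    g = (0.0, 0.0, 0.0)
    for _ in range(steps):
        k1 = g_dot(g, u1, u2)
        k2 = g_dot(tuple(g[i] + 0.5 * h * k1[i] for i in range(3)), u1, u2)
        k3 = g_dot(tuple(g[i] + 0.5 * h * k2[i] for i in range(3)), u1, u2)
        k4 = g_dot(tuple(g[i] + h * k3[i] for i in range(3)), u1, u2)
        g = tuple(g[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
                  for i in range(3))
    return g


u1, u2, t_final = 0.3, 0.5, 1.0
g1, g2, g3 = integrate(u1, u2, t_final, 400)
wei_norman = matmul(matmul(expm(scale(O1, g1)), expm(scale(O2, g2))),
                    expm(scale(O3, g3)))
direct = expm(scale(add(scale(O1, u1), scale(O2, u2)), t_final))
gap = max(abs(wei_norman[i][j] - direct[i][j])
          for i in range(3) for j in range(3))
print(gap)
```

The inputs and time horizon are chosen small enough that $g_2$ stays well away from $\pm\pi/2$, where the equations cease to be meaningful.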

9. Appendix on Algebraic Structures

The purpose of this appendix is to collect a few facts about 
groups, associative algebras and Lie algebras so as to make it easier 
for the reader to make contact with the literature. All the definitions 
needed for sections 2 and 3 are contained in Chapter 7 of reference [3].
Otherwise the book by Rotman [12] is very readable. For algebras (sections
4 and 5) see for example Greub [16] and Gray [13] and for Lie algebras (sections
6 and 7) Samelson [14] and Jacobson [15] are appropriate.

A groupoid is a pair $(S, \cdot)$ where S is a set and $\cdot$ is a binary
operation $\cdot : S \times S \to S$. If this binary operation is associative,
i.e. if $(s_1 \cdot s_2) \cdot s_3 = s_1 \cdot (s_2 \cdot s_3)$, then $(S, \cdot)$ is a semigroup.
A monoid is a semigroup in which there exists an element e such that
for all s in S, es = se = s. Monoids which have the additional property
that for each s in S there exists t in S such that st = ts = e are
called groups. An abelian group is a group such that $s \cdot t = t \cdot s$
for all s and t in S. A group $(R, \cdot)$ is said to be a subgroup of
$(S, \cdot)$ if R is a subset of S and the multiplication is the same on R as
in S. The order of a group is the number of elements in it.

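If $(S, \cdot)$ and $(R, \cdot)$ are semigroups and h is a mapping $h : S \to R$
we say that h is a homomorphism if the diagram below "commutes", i.e.
is consistent:

\[
s_1 s_2 = s_3 \;\Rightarrow\; h(s_1) h(s_2) = h(s_3)
\]

\[
\begin{array}{ccc}
S \times S & \longrightarrow & S \\
\downarrow {\scriptstyle h \times h} & & \downarrow {\scriptstyle h} \\
R \times R & \longrightarrow & R
\end{array}
\]

A homomorphism which is one to one (as opposed to many to one) and onto
is called an isomorphism.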

Now let S be a group and R a subgroup. That is, suppose that
there is an insertion i such that

\[
i : R \to S
\]

is one to one. We can see that the statement $s_1 \sim s_2$ if and only if
there exists r in R such that $s_1 r = s_2$, defines an equivalence
relation on S and hence a partition on S. We call the elements of
this partition cosets. A subgroup R of S is said to be a normal
subgroup if $r \in R$ and $s \in S$ means $s r s^{-1} \in R$, which is to say $sR = Rs$
for each s in S. We say that a group is simple if its only normal
subgroups are itself and the trivial group consisting of the identity.
We will not discuss the decomposition theorems available for groups
since this is done in the present context elsewhere [3].

An algebra $\mathcal{S}$ is a triple $(S, +, \cdot)$ where (S,+) is a vector space
over a field and $\cdot$ is a bilinear multiplication. If $(A \cdot B) \cdot C = A \cdot (B \cdot C)$
for all A, B and C in S then the algebra is said to be
associative. Perhaps the most common example of an associative
algebra is the algebra of n by n matrices with + and $\cdot$ being matrix
addition and matrix multiplication. A Lie algebra (discussed below)
is an example of a nonassociative algebra. By a subalgebra of $\mathcal{S}$ we
mean an algebra $\mathcal{S}_1 \subset \mathcal{S}$ such that $\mathcal{S}_1 \cdot \mathcal{S}_1 \subset \mathcal{S}_1$ and $\mathcal{S}_1 + \mathcal{S}_1 \subset \mathcal{S}_1$.
A subalgebra is called an ideal if $\mathcal{S}_1 \cdot \mathcal{S} \subset \mathcal{S}_1$. Clearly the sum of two
ideals is an ideal. An ideal $\mathcal{S}_1$ is called nilpotent if for each s in
$\mathcal{S}_1$ there is an n such that $s^n = 0$. The sum of all the nilpotent
ideals is called the radical. By a matrix algebra we mean a set of
matrices which is closed under addition and multiplication and which forms
a vector space over its field of definition.

A Lie algebra is an algebra in which (S,+) is a vector space and in
which the product (denoted by [ , ]) is bilinear, that is, for x, y
and z in S we have [(x+y),z] = [x,z]+[y,z] ; [x,(y+z)] = [x,y]+[x,z]
and $\alpha$[x,y] = [$\alpha$x,y] = [x,$\alpha$y]. In addition [ , ] is required to satisfy
the conditions [x,x] = 0, [[x,y],z]+[[y,z],x]+[[z,x],y] = 0. The latter
condition, known as the Jacobi identity, is the substitute for associativity.

We need only be concerned with Lie algebras for which S is a set
of n by n matrices whose entries are real numbers. The Lie product is
the commutator [X,Y] = XY - YX. It is easy to see that this product
satisfies the above conditions.
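
A quick check of these conditions for the commutator can be sketched as follows. This is only an illustration on three arbitrarily chosen (hypothetical) integer matrices, so the arithmetic is exact; it does not replace the short algebraic verification.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]


def bracket(x, y):
    """The commutator [X, Y] = XY - YX."""
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]


# Arbitrary sample matrices (hypothetical values).
X = [[1, 2, 0], [0, -1, 3], [4, 0, 2]]
Y = [[0, 1, 1], [2, 0, -2], [1, 5, 0]]
Z = [[3, 0, -1], [1, 1, 0], [0, 2, 2]]

zero = [[0] * 3 for _ in range(3)]
jacobi_sum = add(
    add(bracket(bracket(X, Y), Z), bracket(bracket(Y, Z), X)),
    bracket(bracket(Z, X), Y),
)
print(bracket(X, X) == zero, jacobi_sum == zero)
```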

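Let $\{H_i\}$ be a set of n by n matrices; the Lie algebra generated
by $\{H_i\}$ consists of $\{H_i\}$, all the elements obtained from $\{H_i\}$ by
repeated commutations, and all the linear combinations of these. A
subalgebra $\mathcal{L}_1$ of a given algebra $\mathcal{L}$ is called an ideal if $[\mathcal{L}_1, \mathcal{L}] \subset \mathcal{L}_1$,
i.e., for all $X \in \mathcal{L}_1$ and $Y \in \mathcal{L}$ the product [X,Y] belongs to $\mathcal{L}_1$.

The set of all elements of $\mathcal{L}$ which are the result of commutation
of some two elements form the derived algebra. This is denoted by $\mathcal{L}'$.
Clearly $\mathcal{L}'$ is an ideal of $\mathcal{L}$. The derived algebra of $\mathcal{L}'$ is denoted
by $\mathcal{L}''$. Continuing, we have the derived series

\[
\mathcal{L} \supset \mathcal{L}' \supset \mathcal{L}'' \supset \cdots \supset \mathcal{L}^{(h)} \supset \mathcal{L}^{(h+1)} \supset \cdots
\]

A Lie algebra $\mathcal{L}$ is said to be solvable if $\mathcal{L}^{(h)} = \{0\}$ for some h.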
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The sum of two solvable Ideals is again a solvable ideal. The radical 
of SB is the sum of all of its solvable ideals. 

The Lie algebra $\mathcal{L}$ is said to be semisimple if its radical is {0}.
It is called simple if it has no ideals other than $\mathcal{L}$ and {0}, and if
$\mathcal{L}' \neq \{0\}$. The last condition serves to avoid trivial cases.

The main source of knowledge about the structure of associative 
algebras comes from Wedderburn's theorem. This result can be found 
in reference [13] as a statement about rings. 

There are two main structure theorems for Lie algebras. The first,
known as Levi's Theorem, states that if $\mathcal{L}$ is a finite dimensional Lie
algebra with radical $\mathcal{L}_0$, then there exists a semisimple subalgebra
$\mathcal{L}_1 \subset \mathcal{L}$ such that given $X \in \mathcal{L}$, there exist unique
$X_0 \in \mathcal{L}_0$ and unique $X_1 \in \mathcal{L}_1$ such that $X = X_0 + X_1$.
For the proof of this theorem see Jacobson [15]. The second structure theorem
explains what happens to the semisimple part and goes like this. A finite
dimensional semisimple Lie algebra $\mathcal{L}$ may be decomposed into the direct sum
$\mathcal{L} = \mathcal{L}_1 \oplus \mathcal{L}_2 \oplus \cdots$, where the
$\mathcal{L}_i$ are ideals which are simple algebras.

10. Appendix on Linear Continuous Time Systems

Consider the standard time invariant linear system with $x(t) \in R^n$,
$u(t) \in R^m$:

$$\dot{x}(t) = Ax(t) + Bu(t); \quad y(t) = Cx(t) \qquad (10.1)$$

Suppose that we assume that this system is controllable and observable.

Now consider the set of all possible maps of the state at t = 0 into
the state at some later time which u can generate. Clearly these maps
are of the form

$$x(t) = e^{At}x(0) + \int_0^t e^{A(t-\sigma)}Bu(\sigma)\,d\sigma$$

which is an affine map. This set of maps, which constitutes the semigroup
of the system, satisfies a very simple differential equation of
the form $\dot{S}(t) = U(t)S(t)$. More specifically,

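$$\frac{d}{dt}\begin{bmatrix} e^{At} & x_a \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} A & Bu(t) \\ 0 & 0 \end{bmatrix}\begin{bmatrix} e^{At} & x_a \\ 0 & 1 \end{bmatrix}$$

where

$$x_a(t) = \int_0^t e^{A(t-\sigma)}Bu(\sigma)\,d\sigma$$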
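
The reason for writing an affine map x -> Mx + v as the block matrix [[M, v], [0, 1]] is that composition of maps becomes matrix multiplication, which is what makes the set of maps a semigroup. The sketch below illustrates this with plain Python; the function names and the integer sample data are assumptions chosen for illustration and do not come from the paper.

```python
def matmul(X, Y):
    """Product of an (m x k) and a (k x n) matrix given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def augmented(M, v):
    """Return the (n+1) x (n+1) matrix [[M, v], [0, 1]] for the map x -> Mx + v."""
    n = len(M)
    rows = [list(M[i]) + [v[i]] for i in range(n)]
    rows.append([0] * n + [1])
    return rows


def apply_augmented(S, x):
    """Apply the affine map encoded by S to the state x."""
    column = [[xi] for xi in x] + [[1]]
    image = matmul(S, column)
    return [row[0] for row in image[:-1]]


# Two illustrative affine maps and an initial state.
S1 = augmented([[2, 0], [1, 1]], [1, -1])
S2 = augmented([[1, 3], [0, 1]], [0, 2])
x0 = [1, 2]

stepwise = apply_augmented(S2, apply_augmented(S1, x0))  # apply S1, then S2
composed = apply_augmented(matmul(S2, S1), x0)           # apply the product once

print(stepwise, composed)
```

The product `matmul(S2, S1)` again has last row `[0, 0, 1]`, so the set of such matrices is closed under multiplication.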


The subset of the n-dimensional affine group which consists of

$$\bigcup_{t \ge 0}\left\{\begin{bmatrix} e^{At} & x \\ 0 & 1 \end{bmatrix} : x \in \text{Range of } (B, AB, \ldots, A^{n-1}B)\right\}$$

is in general not a group since t is restricted to be nonnegative. It
will be called the semigroup of the linear system by analogy with the
standard definition of the semigroup of a machine in automata theory.
Notice that having the solution of the semigroup equation

$$\frac{d}{dt}\begin{bmatrix} S_{11}(t) & S_{12}(t) \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} A & Bu(t) \\ 0 & 0 \end{bmatrix}\begin{bmatrix} S_{11}(t) & S_{12}(t) \\ 0 & 1 \end{bmatrix}$$

with the initial condition being the identity matrix (the identity of the
affine group) gives the solution of equation (10.1) via the rule

INTRODUCTION

Many problems in control and other areas of applied mathematics lead to
stability questions for dynamical systems which are described by mathematical models
involving time-varying parameters. Frequently one may assume that these
time-varying parameters are stochastic processes with known statistics. Typical examples
of interesting applications which lead to such stochastic stability questions are
the stability analysis of numerical computations in the face of round-off error,
systems involving the human operator, sampled data systems with jitter in the
sampling rate, mechanical systems subject to random vibrations, and economic systems
which model some of the uncertainties as variable lags.

Essentially all of the above examples lead to mathematical models in which the
stochastic processes enter the model in a multiplicative way. It is for this class
of systems that the stochastic stability question becomes interesting and
challenging. In contrast, when the stochastic processes enter the model in an additive way
as, for example, in the linear quadratic theory, then the stochastic stability
question usually reduces to the stability of the deterministic system obtained by
putting the stochastic processes equal to zero.

In this paper we will analyze a class of stochastic systems and obtain various
explicit stability criteria. Before we describe the model let us introduce the
following notation: $R$ denotes the real number system, $R^n$ denotes n-dimensional
real Euclidean space, $R^{m \times p}$ denotes the real m×p matrices, prime denotes
transpose, $\ge 0$ ($> 0$) means that a symmetric matrix is nonnegative (positive)
definite, $\lambda[\cdot]$ denotes an arbitrary eigenvalue of a matrix, whereas
$\lambda_{\max}[\cdot]$ ($\lambda_{\min}[\cdot]$) denotes the
maximum (minimum) eigenvalue of a matrix with real eigenvalues. Re denotes the real
part of a complex number, max[·,·] (min[·,·]) denotes the maximum (minimum) of two
real numbers, and $\mathcal{E}\{\cdot\}$ denotes the expected value of a random variable.

We will study the stability of the linear system $\Sigma$ described by the differential
equation:

$$\Sigma: \quad \dot{x} = Ax - BK(t)Cx,$$

where $x \in R^n$ and $A \in R^{n \times n}$, $B \in R^{n \times m}$, and $C \in R^{p \times n}$
are constant matrices and K(t) is a time-varying function taking values in
$R^{m \times p}$. The differential equation $\Sigma$ will be
viewed as describing the closed loop dynamics of the feedback interconnection of the
stationary linear system

$$\Sigma_1: \quad \dot{x}_1 = Ax_1 + Bu_1; \quad y_1 = Cx_1$$

in the forward loop, and the memoryless time-varying linear system

$$\Sigma_2: \quad y_2 = K(t)u_2$$

in the feedback loop. The feedback interconnection equations are given by:
$u_1 = -y_2$; $u_2 = y_1$. It is easily verified that we indeed have
$\Sigma = \Sigma_1 \times \Sigma_2|_{\text{feedback}}$.

This feedback system is shown in Figure 1.

Figure 1: $\Sigma$ viewed as a feedback system

We will assume throughout, for simplicity, that $\Sigma_1 = \{A,B,C\}$ is minimal
(i.e., (A,B) is controllable and (A,C) is observable). The transfer function of $\Sigma_1$
is given by $G(s) = C(Is-A)^{-1}B$. The gain matrix K(t) is assumed to be a stochastic
process whose properties will be described in more detail later. We seek conditions
on the statistics of K(t) which guarantee the stability of $\Sigma$ (to be defined later).

If we consider the equation for $\Sigma$ from a state space point of view then it is
apparent that the case where K(t) is a colored process is quite distinct from the
case that K(t) is white. If K(t) is white noise then the system behaves pretty much
like a linear one and we may use most of the theory on stochastic differential
equations directly as for example the Lyapunov techniques for stochastic systems
(see e.g., Kushner [1967], Chapter 2). If on the other hand K(t) is a colored
process then we should model $\Sigma$ as something like:

$$\dot{z} = Fz + Gw; \quad K = Hz$$
$$\dot{x} = Ax - BKCx$$

with w white noise. This case is thus inherently nonlinear. The results obtained
in this paper fall into two categories. In the first class we consider the colored
case and show how one may use what are essentially linear techniques to obtain
conditions for almost sure asymptotic stability of $\Sigma$. The method of proof uses
Wazewski's inequality previously exploited in this context by Infante [1968]. These
criteria are thus independent of the autocorrelation function of K(t).

The second class of results considers the white noise case and shows how
one may use the frequency-domain stability criteria for linear systems in order to
obtain criteria for mean square stability of $\Sigma$. This question has been studied
extensively in the literature and the results obtained here complement those
obtained by Willems and Blankenship [1971] and Willems [1972].

1. AVERAGE VALUE CRITERIA FOR ALMOST SURE STOCHASTIC STABILITY 

In this section we will assume that the entries of the gain matrix $K(t) \in R^{m \times p}$
are stationary stochastic processes satisfying an ergodicity hypothesis which ensures
the almost sure equality of time averages and ensemble averages. Thus if
$F: R^{m \times p} \to R$ is integrable then we assume that almost surely:

$$\mathcal{E}\{F(K(t))\} = \mathcal{E}\{F(K(0))\} = \lim_{T \to \infty} \frac{1}{T}\int_{t_0}^{t_0+T} F(K(\tau))\,d\tau.$$

We will consider almost sure asymptotic stability. This is defined as:
Definition 1: $\Sigma$ is said to be almost surely asymptotically stable if the equality
$\lim_{t \to \infty} x(t) = 0$ holds with probability one for all given initial conditions $x(t_0)$.

1.1 A Stability Criterion for Completely Symmetric Systems 

Consider the system $\Sigma_1$. There are various ways of describing its response
function from the inputs to the outputs. The most commonly used input/output
descriptions of $\Sigma_1$ give either its transfer function $G(s) = C(Is-A)^{-1}B$ or its
impulse response $W(t) = Ce^{At}B$ ($t \ge 0$). There is however an alternative
input/output description which, although it has roots going back at least as far in
time as do the concepts of transfer function and impulse response, has become
particularly prevalent in the last half decade. This description gives the so-called
Hankel matrix of $\Sigma_1$, defined by:

$$H = \begin{bmatrix} CB & CAB & \cdots & CA^{n}B & \cdots \\ CAB & CA^{2}B & \cdots & CA^{n+1}B & \cdots \\ \vdots & \vdots & & \vdots & \\ CA^{n}B & CA^{n+1}B & \cdots & CA^{2n}B & \cdots \\ \vdots & \vdots & & \vdots & \end{bmatrix} = \left(W^{(i+j-2)}(0)\right).$$

It turns out that many qualitative input/output properties of $\Sigma_1$ are most easily
described in terms of H.

It is well-known that there exist many minimal realizations {A,B,C} of a given
G(s), W(t), or H, but that they all may be recovered from one of them by the
transformation group $\{A,B,C\} \to \{SAS^{-1}, SB, CS^{-1}\}$ with S an arbitrary
invertible element of $R^{n \times n}$. The dimension of a minimal realization of a
given transfer function is called the McMillan degree.
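
The invariance of the input/output description under the transformation group can be illustrated by computing the entries $CA^kB$ of the Hankel matrix for two realizations related by a similarity transformation. This is a minimal sketch in plain Python; the particular matrices, the choice of S, and the helper names are illustrative assumptions.

```python
def matmul(X, Y):
    """Product of an (m x k) and a (k x n) matrix given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def markov_parameters(A, B, C, count):
    """Return the list [CB, CAB, CA^2B, ...] with `count` entries."""
    params = []
    AkB = B
    for _ in range(count):
        params.append(matmul(C, AkB))
        AkB = matmul(A, AkB)
    return params


def hankel(params, size):
    """Finite Hankel truncation for a scalar system: entry (i, j) is CA^(i+j)B."""
    return [[params[i + j][0][0] for j in range(size)] for i in range(size)]


# An illustrative realization of 1 / ((s + 1)(s + 2)).
A = [[0, 1], [-2, -3]]
B = [[0], [1]]
C = [[1, 0]]

# A similarity transformation S and its inverse (written out by hand).
S = [[1, 1], [0, 1]]
S_inv = [[1, -1], [0, 1]]

A2 = matmul(matmul(S, A), S_inv)
B2 = matmul(S, B)
C2 = matmul(C, S_inv)

original = markov_parameters(A, B, C, 4)
transformed = markov_parameters(A2, B2, C2, 4)
print(original == transformed, hankel(original, 2))
```

Both realizations give the same sequence 0, 1, -3, 7, which are also the successive derivatives at t = 0 of the impulse response $e^{-t} - e^{-2t}$.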

We will consider the following class of systems $\Sigma_1$:

Definition 2: $\Sigma_1$ is said to be completely symmetric if m = p and* $H = H' \ge 0$.

*The infinite matrix H is said to be nonnegative definite (denoted by $\ge 0$) if all its
finite truncations are nonnegative definite, i.e. if
$\sum_{i,j=1}^{N} z_i' CA^{i+j-2}B z_j \ge 0$ for all N and for all sequences $\{z_i\}$.

The following lemma gives a very useful alternative characterization of
completely symmetric systems. Its proof, which is not germane to our purposes, is
an immediate consequence of some known facts in realization theory and is left to
the reader.*

Lemma 1: $\Sigma_1$ is completely symmetric if and only if its transfer function
$G(s) = C(Is-A)^{-1}B$ admits a realization $\{A_1,B_1,C_1\}$ with $A_1 = A_1'$ and $B_1 = C_1'$.

Thus $\Sigma_1$ is completely symmetric if and only if there exists a nonsingular (n×n)
matrix S such that $SAS^{-1} = (SAS^{-1})'$ and $SB = (CS^{-1})'$. Completely symmetric
systems have the property that the eigenvalues of A are all real. This is in fact
also the case after applying symmetric feedback and it may be shown that $\Sigma_1$ is
completely symmetric if $G(s) = G'(s)$ and if A-BKC has real eigenvalues for all
$K = K'$. Note also that $\Sigma_1$ is completely symmetric if and only if its transfer
function admits the partial fraction expansion
$G(s) = \sum_{i=1}^{k} \frac{R_i}{s+\lambda_i}$ with $R_i = R_i' \ge 0$. If m = p = 1 then
$\Sigma_1$ is completely symmetric if and only if the poles and the zeros of the transfer
function G(s) are real and interlace, i.e. if $\lambda_1, \lambda_2, \ldots, \lambda_n$ are
the poles and if $z_1, z_2, \ldots, z_r$ are the zeros of G(s), then r = n-1, $\lambda_i$ and
$z_i$ are real, and $\lambda_1 > z_1 > \lambda_2 > \cdots > z_{n-1} > \lambda_n$. This
pole-zero pattern is illustrated in Figure 2.

Figure 2: Typical pole/zero pattern of a completely symmetric system.

Completely symmetric systems are a natural generalization of relaxation systems
(see Willems [1972]) which are completely symmetric systems which satisfy the
additional stability requirement $\mathrm{Re}\,\lambda[A] \le 0$. Thus $\Sigma_1$ is a
relaxation system if and only if its transfer function admits a realization
$\{A_1,B_1,C_1\}$ with $A_1 = A_1' \le 0$ and $B_1 = C_1'$. There are various other ways
of defining a relaxation system. It may be shown that $\Sigma_1$ defines a relaxation
system if and only if $H = H' \ge 0$ and $\sigma H = \sigma H' \le 0$,
where $\sigma H$ denotes the shifted Hankel matrix of $\Sigma_1$, i.e., H with the first
block row (or column) deleted. Alternatively, $\Sigma_1$ defines a relaxation system if
and only if its impulse response $W(t) = Ce^{At}B$ is a completely monotonic function
on $[0,\infty)$, i.e. $W(t) = W'(t)$ and $(-1)^k \frac{d^k}{dt^k}W(t) \ge 0$ for all
$t \ge 0$ and k = 0,1,2,... . Relaxation systems play an important role in physics.
They describe the response of various classes of systems such as R-C and R-L
electrical networks, viscoelastic materials, thermal systems, and chemical reactions.

*The background material of realization theory used here may be found in Brockett
[1970], Chapter 2, or Kalman [1969], Section 10.11.

We now state the main result of this section.

Theorem 1: Assume that $\Sigma_1$ is completely symmetric and that $K = K'$ almost surely.
Let $\bar\lambda_{\max} = \mathcal{E}\{\lambda_{\max}[A-BKC]\}$. Then $\Sigma$ is almost surely
asymptotically stable if $\bar\lambda_{\max} < 0$.

Proof: The proof of Theorem 1 follows an argument due to Wazewski adapted to the
case under consideration (as in Brockett [1970], Section 32, Exercise 6).

Since $\Sigma_1$ is completely symmetric, there exists a nonsingular matrix S such that
$A_1 = SAS^{-1} = (SAS^{-1})' = A_1'$ and $B_1 = SB = (CS^{-1})' = C_1'$. Let $x_1 = Sx$.
Then $x_1$ satisfies the equation:

$$\dot{x}_1 = (A_1 - B_1K(t)B_1')x_1.$$

Let $V(x_1) = x_1'x_1$. Then along solutions of the above equation, we have:

$$\dot{V}(x_1) = 2x_1'(A_1 - B_1K(t)B_1')x_1,$$

which, since $A_1 - B_1K(t)B_1'$ is symmetric, shows that:

$$\dot{V}(x_1) \le 2\lambda_{\max}[A_1 - B_1K(t)B_1']\,V(x_1).$$

Since $A_1 - B_1K(t)B_1'$ and $A - BK(t)C = S^{-1}(A_1 - B_1K(t)B_1')S$ are similar
matrices, this yields:

$$\dot{V}(x_1) \le 2\lambda_{\max}[A-BK(t)C]\,V(x_1).$$

Thus

$$V(x_1(t)) \le V(x_1(t_0))\exp\left(2\int_{t_0}^{t}\lambda_{\max}[A-BK(\tau)C]\,d\tau\right).$$

Finally by the ergodic hypothesis

$$\lim_{T \to \infty}\frac{1}{T}\int_{t_0}^{t_0+T}\lambda_{\max}[A-BK(\tau)C]\,d\tau = \bar\lambda_{\max}$$

is almost surely negative, which shows that $\lim_{t \to \infty} V(x_1(t)) = 0$ almost
surely. Thus $\lim_{t \to \infty} x_1(t) = S\lim_{t \to \infty} x(t) = 0$ almost surely,
which proves the theorem. ■

Note: 1. Theorem 1 predicts stability if $\mathrm{Re}\,\lambda(A) < 0$ and
$K = K' \ge \epsilon I > 0$ almost surely. It then reduces to a special case of the
multivariable circle criterion.

The major difficulty in applying Theorem 1 is that as a rule $\bar\lambda_{\max}$ will be
difficult to compute from the distribution of K since $\lambda_{\max}[A-BKC]$ is a very
nonlinear function of K which does not even admit a general analytic expression.
This difficulty may however be overcome in the important special case that there is
only one stochastic gain in $\Sigma$:

Theorem 2: Assume that m = p = 1 and that $\Sigma_1$ is completely symmetric with transfer
function $g(s) = C(Is-A)^{-1}B$. Let $z_1$ be the largest zero of g(s) and assume that K(t)
possesses the density function p(K). Then $\Sigma$ is almost surely asymptotically stable
if:

$$\int_{z_1}^{\infty}\sigma\,p\!\left(-\frac{1}{g(\sigma)}\right)\frac{d}{d\sigma}\!\left(-\frac{1}{g(\sigma)}\right)d\sigma > 0.$$

Proof: By Theorem 1 it suffices to prove that the integral in the theorem statement
equals $-\bar\lambda_{\max}$. Consider therefore $\lambda_{\max}[A-BKC]$. Since the eigenvalues
of A-BKC are the poles of the system obtained after putting the constant feedback
gain K around $\Sigma_1$ it follows that these eigenvalues are the zeros of 1+Kg(s).
Since the poles and zeros of g(s) are real and interlacing it follows from a simple
root-locus consideration that the maximum zero of 1+Kg(s) is a monotone decreasing
function of K which varies from $z_1$ for $K = \infty$ to $+\infty$ for $K = -\infty$. The gain
K and $\lambda_{\max}[A-BKC]$ are in fact related by $g(\lambda_{\max}[A-BKC]) = -\frac{1}{K}$.
Thus, by a standard formula from probability theory we have that:

$$\bar\lambda_{\max} = \int_{-\infty}^{\infty}\lambda_{\max}[A-BKC]\,p(K)\,dK = -\int_{z_1}^{\infty}\sigma\,p\!\left(-\frac{1}{g(\sigma)}\right)\frac{d}{d\sigma}\!\left(-\frac{1}{g(\sigma)}\right)d\sigma$$

which yields the desired result. ■

Notes: 2. Figure 3 shows the behavior of the functions $g(\sigma)$, $-1/g(\sigma)$, and
$\lambda_{\max}(A-BKC)$. The qualitative behavior of these functions is very well
understood as a result of exhaustive analysis of R-C and R-L electrical networks
(see, e.g., Guillemin [1957], Chapter 4).

Figure 3: Sketch of $g(\sigma)$, $f(\sigma) = -1/g(\sigma)$, and $\lambda_{\max}(A-BKC)$

3. Theorem 2 indicates the destabilizing effect of the stochastic gain. To see this,
let us assume (essentially without loss of generality) that $\mathcal{E}\{K\} = 0$. It may be
shown that $\lambda_{\max}(A-BKC)$ is a strictly convex function of K which by Jensen's
inequality (see Feller [1966], p. 151) implies that $\bar\lambda_{\max} \ge \lambda_1$ with
equality holding if and only if K = 0 almost surely. Note also that Theorem 2 is
easily extended to the case where K does not possess a density function.
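
The destabilizing effect of a zero-mean gain can be seen on a small completely symmetric example. The sketch below assumes, purely for illustration, A = diag(-1, -3) and B = C' = (1, 1)', for which $\lambda_{\max}[A-BKC] = -2 - K + \sqrt{1+K^2}$ in closed form, and a gain that takes the two values ±c with equal probability; none of these values come from the paper.

```python
import math


def lambda_max_sym2(a, b, d):
    """Largest eigenvalue of the symmetric matrix [[a, b], [b, d]]."""
    return (a + d) / 2 + math.hypot((a - d) / 2, b)


def lambda_max_closed_loop(K):
    """lambda_max[A - BKC] for the assumed example A = diag(-1, -3), B = C' = (1, 1)'."""
    # A - K*B*B' = [[-1 - K, -K], [-K, -3 - K]]
    return lambda_max_sym2(-1.0 - K, -K, -3.0 - K)


def mean_lambda_max(gains, probs):
    """Expected value of lambda_max over a discrete gain distribution."""
    return sum(p * lambda_max_closed_loop(K) for K, p in zip(gains, probs))


no_noise = lambda_max_closed_loop(0.0)                    # largest pole, -1
small_spread = mean_lambda_max([-1.0, 1.0], [0.5, 0.5])   # -2 + sqrt(2)
large_spread = mean_lambda_max([-2.0, 2.0], [0.5, 0.5])   # -2 + sqrt(5)

print(no_noise, small_spread, large_spread)
```

For this example the averaged eigenvalue rises from -1 with no noise to about -0.59 for K = ±1, and becomes positive for K = ±2, so the condition of Theorem 1 is met in the first two cases and fails in the third, in line with the Jensen argument.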



4. Let $\lambda_i(K)$ denote the zeros of p(s)+Kq(s), where g(s) = q(s)/p(s). From
root-locus considerations it is easily seen that
$z_i < \lambda_i(K) < \lambda_i(0) < \lambda_i(-K) < z_{i-1}$ for K > 0 and i = 1,2,...,n
(where we have put $z_0 = \infty$ and $z_n = -\infty$). Since
$-\sum_{i=1}^{n}\lambda_i + Kq_{n-1} = -\sum_{i=1}^{n}\lambda_i(K)$ we thus obtain the
following upper bound for $\lambda_{\max}$ (see Figure 3):

$$\lambda_{\max} \le \begin{cases} \lambda_1 & \text{for } K \ge 0 \\ \lambda_1 - q_{n-1}K & \text{for } K \le 0 \end{cases}$$

This shows that $\Sigma$ is almost surely asymptotically stable if:

$$\mathcal{E}\{\min[-\lambda_1,\; q_{n-1}K - \lambda_1]\} > 0$$

which requires in particular that $\lambda_1 < 0$.

Examples: 1. If K is uniformly distributed between the limits $K_-$ and $K_+$ then
$\Sigma$ is almost surely asymptotically stable if:

$$\frac{z_+}{g(z_+)} - \frac{z_-}{g(z_-)} + \int_{z_+}^{z_-}\frac{d\sigma}{g(\sigma)} > 0$$

where $z_+ = \lambda_{\max}(K_+)$ and $z_- = \lambda_{\max}(K_-)$. This inequality is easily
verified directly from the graph of $f(\sigma) = -1/g(\sigma)$.

2. The limiting behavior of $\lambda_{\max}$ as $K \to \pm\infty$ is given by (see Figure 3):

$$\lambda_{\max} \to \begin{cases} z_1 & \text{for } K \to \infty \\ -q_{n-1}K + a & \text{for } K \to -\infty \end{cases}$$

where $a = \sum_{i=1}^{n}\lambda_i - \sum_{i=1}^{n-1}z_i = \frac{q_{n-2}}{q_{n-1}} - p_{n-1}$.
Thus as K becomes more and more distributed at large absolute values we see that
almost sure asymptotic stability results if:

$$z_1P_+ + \Big(\sum_{i=1}^{n}\lambda_i - \sum_{i=1}^{n-1}z_i\Big)P_- - q_{n-1}\int_{-\infty}^{0}Kp(K)\,dK < 0$$

where $P_+ = P(K > 0)$ and $P_- = P(K < 0)$. For the uniformly distributed case studied in
Example 1 with $K_+ > 0$ and $K_- < 0$ this condition requires

$$z_1K_+ - \Big(\sum_{i=1}^{n}\lambda_i - \sum_{i=1}^{n-1}z_i\Big)K_- + \frac{q_{n-1}}{2}K_-^2 < 0.$$

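3. Consider the equation studied by Infante [1968], p. 11:

$$\dot{n} = \frac{f(t)-\beta}{\ell}\,n + \lambda c$$
$$\dot{c} = \frac{\beta}{\ell}\,n - \lambda c$$

where $\beta, \ell, \lambda > 0$. This equation describes the kinetics of a simple nuclear
reactor problem. It is easily seen that Theorem 2 applies to this case with

$$g(s) = \frac{1}{\ell}\,\frac{s+\lambda}{s\left(s+\frac{\beta}{\ell}+\lambda\right)} \quad \text{and} \quad k(t) = -f(t).$$

Thus almost sure asymptotic stability results if:

$$\int_0^{\infty}(\sigma-\lambda)\,p\!\left(\frac{(\sigma-\lambda)(\ell\sigma+\beta)}{\sigma}\right)\ell\left(1+\frac{\lambda\beta}{\ell\sigma^2}\right)d\sigma < 0$$

where p(·) denotes the density function of f.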

1.2 A Frequency-Domain Stability Criterion

In this section we will derive another criterion for almost sure asymptotic
stability of the system $\Sigma$. We first recall the definition of a positive real
function:

Definition 3: Let H(s) be a matrix of real rational functions of the complex variable
s. It is said to be positive real if $H(s) + H'(\bar{s}) \ge 0$ for all Re s > 0,
s ≠ poles of H(s).

There exist various equivalent conditions for positive realness. Such conditions
may be found in most books on electrical network synthesis (see, for example, Guillemin
[1957], Chapter 1, or Newcomb [1966]). Positive real functions play a fundamental
role in the theory of passive systems, particularly in the analysis and synthesis
of electrical networks. They have recently also been shown to be an essential tool for
obtaining frequency-domain stability criteria for feedback systems. A time-domain
condition for positive realness is given in the following lemma, the celebrated
Kalman-Yacubovich-Popov lemma:

Lemma 2: Consider the minimal system:

$$\dot{z} = Fz + Gv; \quad w = Hz,$$

and let $\sigma$ be a real number. Then $H(I(s-\sigma)-F)^{-1}G$ is positive real if and only if
there exists a solution $Q = Q' > 0$ to the relations:

$$F'Q + QF \le -2\sigma Q;$$
$$QG = H'.$$

For a proof of Lemma 2 we refer the reader to Willems [1972].

The value of the above lemma in stability analysis lies in the fact that the
quadratic form induced by the matrix Q yields a very suitable candidate for a
Lyapunov function. It plays a crucial role in the following theorem which is the
main result of this section:

Theorem 3: Let m = p. Then $\Sigma$ is almost surely asymptotically stable if there exists
a constant (m×m) matrix $\Lambda$ and a real number $\sigma$ such that:

(i) $\Lambda + \Lambda' > 0$;

(ii) $F(s-\sigma) = G(s-\sigma)(I - \Lambda(s-\sigma)G(s-\sigma))^{-1}$ is positive real;

and (iii) $\mathcal{E}\{\min[\sigma, \lambda_{\min}[(K+K')(\Lambda+\Lambda')^{-1}]]\} > 0$.

Proof: We will assume that $(I-\Lambda CB)$ is invertible and that the McMillan degree of
F(s) is n. The general case may be resolved by a subsequent limiting argument which
is left to the reader.

It is easily seen that F(s) is the transfer function of the system:

$$\dot{z} = Az + B(v + \Lambda\dot{w}); \quad w = Cz,$$

or

$$\dot{z} = (A + B(I-\Lambda CB)^{-1}\Lambda CA)z + B(I-\Lambda CB)^{-1}v; \quad w = Cz.$$

This system is minimal since the McMillan degree of F(s) is assumed to be n. Thus
by condition (ii) and Lemma 2 there exists a matrix $Q = Q' > 0$ such that

$$[A + B(I-\Lambda CB)^{-1}\Lambda CA]'Q + Q[A + B(I-\Lambda CB)^{-1}\Lambda CA] \le -2\sigma Q$$

and

$$QB(I-\Lambda CB)^{-1} = C'.$$

Let S be an invertible (n×n) matrix such that $S'S = Q$ and let $x_1 = Sx$. The
equation for $x_1$ is given by:

$$\dot{x}_1 = (A_1 - B_1K(t)C_1)x_1$$

where $A_1 = SAS^{-1}$, $B_1 = SB$, and $C_1 = CS^{-1}$. Moreover, $B_1 = C_1'(I - \Lambda C_1B_1)$
and $(A_1 + C_1'\Lambda C_1A_1)' + (A_1 + C_1'\Lambda C_1A_1) \le -2\sigma I$. Consider now the
derivative of $V(x_1) = x_1'x_1 + y_1'\Lambda y_1$, where $y_1 = C_1x_1$, along solutions of
the above differential equation. A simple calculation using the above relations
shows that:

$$\dot{V}(x_1) \le -2\sigma x_1'x_1 - 2y_1'K(t)y_1 = -2\sigma V(x_1) + 2y_1'(\sigma\Lambda - K(t))y_1.$$

Let $\lambda(t) = \lambda_{\min}[(K(t)+K'(t))(\Lambda+\Lambda')^{-1}]$ and let P be a nonsingular
matrix such that $P'P = \Lambda + \Lambda'$. Since
$\lambda(t) = \lambda_{\min}[(P')^{-1}(K(t)+K'(t))P^{-1}]$ it thus follows that
$y_1'K(t)y_1 \ge \lambda(t)\,y_1'\Lambda y_1$ for all $y_1$. Hence

$$\dot{V}(x_1) \le -2\sigma V(x_1) + 2(\sigma - \lambda(t))\,y_1'\Lambda y_1.$$

We now distinguish two cases:

(i) $\lambda(t) \ge \sigma$ which implies $\dot{V}(x_1) \le -2\sigma V(x_1)$;
and (ii) $\lambda(t) \le \sigma$ which, since $V(x_1) \ge y_1'\Lambda y_1$, implies:

$$\dot{V}(x_1) \le -2\sigma V(x_1) + 2(\sigma - \lambda(t))V(x_1) = -2\lambda(t)V(x_1).$$

Hence

$$\dot{V}(x_1) \le -2\min[\sigma, \lambda(t)]\,V(x_1)$$

and

$$V(x_1(t)) \le V(x_1(t_0))\exp\left(-2\int_{t_0}^{t}\min[\sigma, \lambda(\tau)]\,d\tau\right).$$

By the ergodic hypothesis and condition (iii) this indeed implies that
$\lim_{t \to \infty} V(x_1(t)) = 0$ almost surely. Thus
$\lim_{t \to \infty} x_1(t) = S\lim_{t \to \infty} x(t) = 0$ almost surely, which proves the
theorem. ■

Notes: 5. If $K + K' \ge \epsilon I > 0$ almost surely and if G(s) is positive real then
Theorem 3 predicts almost sure asymptotic stability by considering the limit
$\sigma \to 0$ and $\Lambda \to 0$. In this sense Theorem 3 is thus a generalization of the
circle criterion. The advantage of the theorem is that it allows the gain K(t) to
become negative provided however this is compensated by K(t) being sufficiently
positive at some other time.

One of the disadvantages of Theorem 3 is the inherent difficulty in verifying
the average value condition from the distribution of K since
$\lambda_{\min}[(K+K')(\Lambda+\Lambda')^{-1}]$ is a very nonlinear function of K. In the
scalar case however one may resolve the various conditions in Theorem 3 much
further. Thus we arrive at the following more explicit criterion for systems with a
single stochastic parameter:

Theorem 4: Assume that m = p = 1 and let

$$g(s) = C(Is-A)^{-1}B = \frac{q(s)}{p(s)} = \frac{q_{n-1}s^{n-1} + \cdots + q_0}{s^n + p_{n-1}s^{n-1} + \cdots + p_0}$$

denote the transfer function of $\Sigma_1$. Then $\Sigma$ is almost surely asymptotically stable
if there exists a real constant $\beta$ such that
(i) $\mathcal{E}\{\min[\beta, K]\} > 0$;
(ii) the poles of g(s) lie in $\mathrm{Re}\,s < -q_{n-1}\beta$;
and (iii) the locus of $g(j\omega - q_{n-1}\beta)$, $-\infty < \omega < \infty$, does not
encircle or intersect the closed disc centered on the negative real axis of
the complex plane and passing through the origin and the point $-\frac{1}{\beta}$.

Proof: By Theorem 3 it suffices to show that there exists a constant $\Lambda > 0$ such
that $F(s-\sigma) = g(s-\sigma)(1 - \Lambda(s-\sigma)g(s-\sigma))^{-1}$ is positive real and
$\mathcal{E}\{\min[\sigma\Lambda, K]\} > 0$. Note that this implies $\sigma > 0$. Now $F(s-\sigma)$
is positive real if and only if
$F^{-1}(s-\sigma) = \frac{1}{g(s-\sigma)} - \Lambda(s-\sigma)$ is positive real. Since
$F^{-1}(s-\sigma) = \left(\frac{1}{q_{n-1}} - \Lambda\right)(s-\sigma) + \frac{r(s-\sigma)}{q(s-\sigma)}$
with r(s) a polynomial of degree at most (n-1) it follows that
$\Lambda \le \frac{1}{q_{n-1}}$ and that $F^{-1}(s-\sigma)$ will be positive real for some
$\Lambda$ if and only if it is positive real for $\Lambda = \frac{1}{q_{n-1}}$, which is thus
the optimal value of $\Lambda$ to consider. The condition $q_{n-1} > 0$ follows from the
frequency domain condition (iii) as a result of the behavior of $g(j\omega-\sigma)$ for
$\omega \to \infty$. Pick now $\sigma = \beta q_{n-1}$.

In order to complete the proof of the theorem it suffices to show that
$F^{-1}(s-\sigma) = \frac{1}{g(s - q_{n-1}\beta)} - \frac{s}{q_{n-1}} + \beta$ is positive real.
By one of the tests of positive realness this can be achieved by proving that
$\mathrm{Re}\,F^{-1}(s-\sigma)|_{s=j\omega} \ge 0$ and (since $F^{-1}(s-\sigma)$ has no more zeros
than poles) that the roots of $q(s-\sigma)$ lie in Re s < 0. The real part condition
comes down to asking $g(s-\sigma)|_{s=j\omega}$ to have the non-intersection property
stated in condition (iii). By the non-encirclement condition the roots of
$p(s-\sigma) + kq(s-\sigma)$ lie in Re s < 0 for $k > \beta$. By letting $k \to \infty$ this
implies that the roots of $q(s-\sigma)$ lie in $\mathrm{Re}\,s \le 0$. By the non-intersection
property $g(j\omega-\sigma) \neq 0$ for $-\infty < \omega < \infty$ and we conclude that the
roots of $q(s-\sigma)$ indeed lie in Re s < 0 as desired. ■
Notes: 6. It may be shown that conditions (ii) and (iii) of Theorem 4 will be
verified for $\beta_1$ if they are verified for $\beta_2 \ge \beta_1$. Thus the optimal
$\beta$ to consider is the smallest number which satisfies condition (i) of the theorem.

7. If K has density function p(K) then condition (i) of Theorem 4 requires that:

$$h(\beta) = \beta\int_{\beta}^{\infty}p(K)\,dK + \int_{-\infty}^{\beta}Kp(K)\,dK > 0.$$

Now $h(0) \le 0$, $dh/d\beta \ge 0$, and $h(\infty) = \mathcal{E}\{K\}$. Thus there exists a
$\beta$ such that $h(\beta) > 0$ if and only if $\mathcal{E}\{K\} > 0$, and if so, then there
exists a $\beta^*$ such that $h(\beta) > 0$ for $\beta > \beta^*$. Thus Theorem 4 will predict
almost sure asymptotic stability of $\Sigma$ if $\mathcal{E}\{K\} > 0$, if the poles of g(s) lie
in $\mathrm{Re}\,s < -q_{n-1}\beta^*$, and if $g(j\omega - q_{n-1}\beta^*)$ satisfies
the frequency domain condition of Theorem 4. This procedure lends itself very nicely
to the graphical analysis illustrated in Figure 4.

Figure 4: Illustrating the application of Theorem 4.


Examples: 4. Assume that K is uniformly distributed between $K_-$ and $K_+$ with
$K_- \le 0$ and $K_+ + K_- \ge 0$. Then $\beta^* = K_+ - \sqrt{K_+^2 - K_-^2}$. Expressed in
terms of the spread $\Delta K = K_+ - K_-$ and the mean $M = \frac{K_+ + K_-}{2}$ this yields
$\beta^* = \left(\sqrt{\frac{\Delta K}{2}} - \sqrt{M}\right)^2$ which in the range of interest
$\frac{\Delta K}{2} \ge M \ge 0$ shows that $\beta^*$ increases with $\Delta K$ for fixed M.
This again indicates the destabilizing effect due to the uncertainty in K.
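
The threshold in Example 4 can be checked against condition (i) of Theorem 4 directly. The sketch below evaluates $\mathcal{E}\{\min[\beta, K]\}$ in closed form for a uniformly distributed gain and confirms that it vanishes at the stated threshold; the function names and the numerical limits are illustrative assumptions.

```python
import math


def beta_star_uniform(k_minus, k_plus):
    """Threshold from Example 4: K_+ - sqrt(K_+^2 - K_-^2)."""
    return k_plus - math.sqrt(k_plus ** 2 - k_minus ** 2)


def expected_min_uniform(beta, k_minus, k_plus):
    """E{min[beta, K]} for K uniform on [k_minus, k_plus] (closed form)."""
    if beta <= k_minus:
        return beta
    if beta >= k_plus:
        return (k_minus + k_plus) / 2
    spread = k_plus - k_minus
    # Integral of K over [k_minus, beta] plus beta times the length of [beta, k_plus].
    return ((beta ** 2 - k_minus ** 2) / 2 + beta * (k_plus - beta)) / spread


# Mean M = 1 with spread 4, then the same mean with spread 6.
narrow = beta_star_uniform(-1.0, 3.0)
wide = beta_star_uniform(-2.0, 4.0)

print(narrow, wide, expected_min_uniform(narrow, -1.0, 3.0))
```

With the mean held at M = 1, widening the spread from 4 to 6 raises the threshold from about 0.17 to about 0.54, which is the monotonic behavior described in the example.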

5. Let $\Sigma_1$ be a completely symmetric system as defined in Section 1.1. Then
conditions (ii) and (iii) of Theorem 4 will be satisfied as long as
$q_{n-1}\beta < -\lambda_1$ with $\lambda_1$ the largest pole of g(s). The stability
condition then becomes $\mathcal{E}\{\min[-\frac{\lambda_1}{q_{n-1}}, K]\} > 0$
which is similar to, but more conservative than, the condition obtained in Note 4. Thus
Theorem 2 which only applies to completely symmetric systems gives a sharper stability
estimate than Theorem 4 which applies to general systems.


2. ANALYSIS OF THE MEAN AND THE COVARIANCE EQUATIONS 
This last section of the paper is concerned with the stability analysis of the
mean and the covariance of the state of $\Sigma$ where K(t) is assumed to be a white
stochastic process. For simplicity we will consider only the case in which the
process K(t) is scalar valued, but we will treat the non-stationary case. If we
denote the mean of K(t) by $\bar{k}(t)$ and the variance by $\bar\sigma^2(t)$ then $\Sigma$ is
described by the stochastic differential equation:

$$\Sigma': \quad dx = (A - \bar{k}(t)bc)x\,dt + \bar\sigma(t)bcx\,d\beta,$$

where $A \in R^{n \times n}$, $b \in R^{n \times 1}$, $c \in R^{1 \times n}$, and $\beta$ denotes a
Wiener process with zero mean and unit covariance. This stochastic differential
equation is to be interpreted in the sense of Ito and we will take it as the
starting point of our analysis.

It is well-known that if $\bar{k}(t)$ and $\bar\sigma(t)$ are sufficiently smooth (e.g., locally
integrable) then for all given $x(t_0)$ there exists a unique solution to $\Sigma'$ for
$t \ge t_0$. Let $\mu(t) = \mathcal{E}\{x(t)\}$, $\Gamma(t) = \mathcal{E}\{x(t)x'(t)\}$, and
$R(t) = \mathcal{E}\{(x(t)-\mu(t))(x(t)-\mu(t))'\}$ denote respectively the mean, the second
moment matrix, and the covariance matrix of x(t). These are governed by the equations:

$$\dot\mu = (A - \bar{k}(t)bc)\mu;$$
$$\dot\Gamma = (A - \bar{k}(t)bc)\Gamma + \Gamma(A - \bar{k}(t)bc)' + \bar\sigma^2(t)bc\Gamma c'b';$$

and $R(t) = \Gamma(t) - \mu(t)\mu'(t)$,

with initial conditions $\mu(t_0) = x(t_0)$ and $\Gamma(t_0) = x(t_0)x'(t_0)$.

We will be concerned with the asymptotic properties of these variables. The
relevant stochastic stability concepts are now defined:

Definition 4: $\Sigma'$ is said to be asymptotically stable in the mean, in the mean
square, or in the covariance if, respectively, $\lim_{t \to \infty}\mu(t) = 0$,
$\lim_{t \to \infty}\Gamma(t) = 0$, or $\lim_{t \to \infty}R(t) = 0$ for all given initial
conditions $x(t_0)$.

It is easily seen from the relations $\Gamma(t) = R(t) + \mu(t)\mu'(t)$ and
$R(t) = R'(t) \ge 0$ that mean square asymptotic stability implies stability in the
mean and in the covariance. The stability of the mean is a standard deterministic
stability problem for which many criteria have been derived. These criteria involve
the transfer function $g(s) = c(Is-A)^{-1}b$ and properties of $\bar{k}(t)$ as, for example,
its bounds (e.g. in the circle criterion: see Brockett [1970], Section 35), bounds on its
derivative, or its periodicity. The stability of the differential equation which
expresses the evolution of the second moment matrix $\Gamma(t)$ is much more intricate
to analyze and we will show how criteria like the multivariable circle criterion
may be used. If $\bar\sigma^2(t) = 0$ then its stability is equivalent to the stability of
the mean equation, whereas if $\bar\sigma^2(t) \neq 0$ then more stringent conditions will have
to be imposed.
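
The gap between stability in the mean and in the mean square is easiest to see in the scalar case. The sketch below assumes n = 1 with b = c = 1 and constant mean gain and noise intensity, so that the mean obeys $\dot\mu = (a - \bar{k})\mu$ and the second moment obeys $\dot\Gamma = (2(a - \bar{k}) + \bar\sigma^2)\Gamma$; the function names and numerical values are illustrative assumptions.

```python
import math


def mean_rate(a, k_bar):
    """Growth rate of the mean for the assumed scalar system (b = c = 1)."""
    return a - k_bar


def second_moment_rate(a, k_bar, sigma_bar):
    """Growth rate of the second moment: 2(a - k_bar) + sigma_bar^2."""
    return 2.0 * (a - k_bar) + sigma_bar ** 2


def second_moment(gamma0, a, k_bar, sigma_bar, t):
    """Closed-form solution of the scalar second moment equation."""
    return gamma0 * math.exp(second_moment_rate(a, k_bar, sigma_bar) * t)


a, k_bar = 0.0, 1.0
weak_noise = second_moment_rate(a, k_bar, 1.0)               # -1: mean square stable
strong_noise = second_moment_rate(a, k_bar, math.sqrt(3.0))  # +1: mean square unstable

print(mean_rate(a, k_bar), weak_noise, strong_noise)
print(second_moment(1.0, a, k_bar, math.sqrt(3.0), 2.0))
```

In both cases the mean decays at rate -1, but with the larger noise intensity the second moment grows, so the extra condition on $\bar\sigma^2$ is genuinely needed.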


2.1 Multilinear System Theory 

It is easy to see that if $x_1$ and $x_2$ are vectors which satisfy the linear
equations:

$$\dot{x}_1 = A_1(t)x_1; \quad x_1 \in R^{n_1},$$

and

$$\dot{x}_2 = A_2(t)x_2; \quad x_2 \in R^{n_2},$$

then the product $x_1x_2'$ satisfies also a linear equation, namely:

$$\frac{d}{dt}x_1x_2' = A_1(t)x_1x_2' + x_1x_2'A_2'(t).$$

By taking $x_1 = x_2$ we see that if x satisfies a linear equation, then so does
xx'.

This idea generalizes from quadratic forms to homogeneous p-th degree forms.
These facts have been known at least since Lyapunov's thesis, but they have to
the present time been used very little in system theory. They may for example be
exploited in the minimization of homogeneous performance measures of degree p > 2
for linear dynamical systems.

The above ideas may be used in setting up transfer functions for a class of
bilinear systems. We will make some use of the Kronecker product denoted here by
$\otimes$. Thus the Kronecker product of $M \in R^{m \times n}$ and $R \in R^{p \times q}$ is
the element $M \otimes R \in R^{mp \times nq}$ defined by:

$$M \otimes R = \begin{bmatrix} m_{11}R & m_{12}R & \cdots & m_{1n}R \\ m_{21}R & m_{22}R & \cdots & m_{2n}R \\ \vdots & \vdots & & \vdots \\ m_{m1}R & m_{m2}R & \cdots & m_{mn}R \end{bmatrix}$$

The main use of this notation is that if an (n×n) matrix Q is written in
lexicographic notation as the $n^2$-vector

$$Q_v = \mathrm{col}(q_{11}, q_{12}, \ldots, q_{1n}, \ldots, q_{n1}, q_{n2}, \ldots, q_{nn})$$

then $(MQ)_v = (M \otimes I)Q_v$.
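
Which Kronecker factor carries M in this identity depends on the order in which the entries of Q are stacked. The sketch below, in plain Python with illustrative 2×2 integer matrices, checks that stacking row by row (as in the listing of $Q_v$ above) gives $(MQ)_v = (M \otimes I)Q_v$, while stacking column by column gives $(I \otimes M)$ instead. All helper names are assumptions made for the example.

```python
def matmul(X, Y):
    """Product of an (m x k) and a (k x n) matrix given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def kron(M, R):
    """Kronecker product: the block matrix whose (i, j) block is M[i][j] * R."""
    return [[M[i][j] * R[k][l]
             for j in range(len(M[0])) for l in range(len(R[0]))]
            for i in range(len(M)) for k in range(len(R))]


def matvec(X, v):
    return [sum(x * vi for x, vi in zip(row, v)) for row in X]


def row_stack(Q):
    """Entries listed row by row: q11, q12, ..., q1n, q21, ..."""
    return [q for row in Q for q in row]


def col_stack(Q):
    """Entries listed column by column: q11, q21, ..., qn1, q12, ..."""
    return [Q[i][j] for j in range(len(Q[0])) for i in range(len(Q))]


M = [[1, 2], [3, 4]]
Q = [[5, 6], [7, 8]]
I2 = [[1, 0], [0, 1]]
MQ = matmul(M, Q)

by_rows = matvec(kron(M, I2), row_stack(Q))
by_cols = matvec(kron(I2, M), col_stack(Q))

print(by_rows == row_stack(MQ), by_cols == col_stack(MQ))
```

The transfer function formulas that follow involve only the symmetric combinations $A \otimes I + I \otimes A$, $b \otimes I + I \otimes b$ and $c \otimes I + I \otimes c$, so they read the same under either stacking order.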

Consider now the following lemma: 

Lemma 3 : Let {A,b,cl be a minimal realization of the transfer function g(s) * 

c(Is-A) *b. Then the differential equation: 

0 * AQ + OA + bv 1 + vb * ; w * cO , 

defines a minimal realization on the n dimensional space of symmetric (nxn) 
matrices of the transfer function: 

r 2 1 ~l 

g L J (s) - (c® I+I©c) (Is-I® A-A® I) (b®I+I®b) . 

We will not give a detailed proof of this lemma * The proof exploits the fact 
that the above matrix equation describes the bilinear system 

xx ' « Axx ' + xx'A 1 + bux* + xu'b' ; 
at 

yx * - cxx 1 

where x ® Ax + bu; v - cx. 

The dynamical system identified in the statement of Lemma 3 plays an important
role in the analysis of the covariance equation under consideration. We know
from this lemma that controllability and observability will be preserved. The
poles of $g^{[2]}(s)$ are given by $\{\lambda_i(A) + \lambda_j(A)\}$, i,j = 1,...,n. There appears to be no
convenient general formula for deriving $g^{[2]}(s)$ from g(s). In a specific case
however, it is a relatively straightforward matter to calculate $g^{[2]}(s)$.

Example: 4. Let {A,b,c} be the standard controllable representation (see Brockett
[1970], p. 106) of $g(s) = \frac{1}{s^2+as+b}$. Then

$$g^{[2]}(s) = \frac{1}{s^3+3as^2+(2a^2+4b)s+4ab} \begin{bmatrix} 2(s+2a) & 4 \\ s(s+2a) & 2s \end{bmatrix} .$$
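The example can be checked with exact rational arithmetic. The sketch below assumes that the standard controllable representation is the companion form A = [[0, 1], [-b, -a]], b = (0, 1)', c = (1, 0), and that the symmetric matrix Q is coordinatized by $(q_{11}, q_{12}, q_{22})$; the function names are illustrative. Under these assumptions the equation of Lemma 3 becomes a third-order linear system whose transfer matrix is evaluated at a test point and compared with the closed form.

```python
from fractions import Fraction

def solve(N, rhs):
    """Solve N x = rhs by Gauss-Jordan elimination over the rationals."""
    n = len(N)
    aug = [[Fraction(v) for v in N[i]] + [Fraction(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def g2_from_state_space(a, b, s):
    """Evaluate the transfer matrix of dQ/dt = AQ + QA' + bv' + vb', w = cQ.

    In the coordinates (q11, q12, q22) the equations read
      dq11/dt = 2 q12
      dq12/dt = -b q11 - a q12 + q22 + v1
      dq22/dt = -2b q12 - 2a q22 + 2 v2
    with outputs (q11, q12).
    """
    N = [[s, -2, 0], [b, s + a, -1], [0, 2 * b, s + 2 * a]]   # sI - F
    cols = [solve(N, g)[:2] for g in ([0, 1, 0], [0, 0, 2])]  # one solve per input
    return [[cols[0][0], cols[1][0]], [cols[0][1], cols[1][1]]]

def g2_formula(a, b, s):
    """Closed form stated in the example."""
    d = s**3 + 3 * a * s**2 + (2 * a**2 + 4 * b) * s + 4 * a * b
    return [[2 * (s + 2 * a) / d, 4 / d], [s * (s + 2 * a) / d, 2 * s / d]]

a0, b0, s0 = Fraction(1), Fraction(2), Fraction(3)
agree = g2_from_state_space(a0, b0, s0) == g2_formula(a0, b0, s0)
```

Any test point s that is not a pole can be used; exact fractions avoid rounding questions in the comparison.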



2.2 The Circle Criterion for the Covariance Equation

We now return to the covariance equation:

$$\dot{\Gamma} = (A-k(t)bc)\Gamma + \Gamma(A-k(t)bc)' + q^2(t)\,bc\Gamma c'b'$$

which we model as the feedback system:

$$\Sigma_1' : \ \dot{Q} = AQ + QA' + bv' + vb' + bwb' ; \quad y = cQ, \ z = cQc' ,$$

$$\Sigma_2' : \ v = -k(t)y, \quad w = q^2(t)z .$$

It follows from Lemma 3 that $\Sigma_1'$ is completely controllable and completely observable.
Let

$$G(s) = \begin{bmatrix} G_{11}(s) & G_{12}(s) \\ G_{21}(s) & G_{22}(s) \end{bmatrix}$$

where $y(s) = G_{11}(s)v(s) + G_{12}(s)w(s)$
and $z(s) = G_{21}(s)v(s) + G_{22}(s)w(s)$,
denote the transfer function of $\Sigma_1'$. It is easily calculated that G(s) is given by:

$$G(s) = \begin{bmatrix} c \otimes I + I \otimes c \\ c \otimes c \end{bmatrix} (Is - A \otimes I - I \otimes A)^{-1} \begin{bmatrix} b \otimes I + I \otimes b & b \otimes b \end{bmatrix}$$

Thus the stability of the covariance equation is equivalent to the stability of a
deterministic feedback system with (n+1) feedback loops, with transfer function
G(s) in the forward loop and a time-varying gain matrix F(t), built from k(t) and
$q^2(t)$ as in $\Sigma_2'$, in the feedback loop.

The multivariable circle criterion and its various generalizations are thus
immediately applicable to this situation. We will illustrate this only in the
simplest case. Let $\|\cdot\|$ denote some norm on $R^{n+1}$ and let matrix norms be
induced norms. The small loop gain theorem due to Zames [1966] thus leads to:

Theorem 5: Assume that $\mathrm{Re}\,\lambda[A] < 0$. Then $\Sigma'$ is asymptotically stable in the mean
square if:

$$\Big( \sup_{-\infty<\omega<\infty} \|G(j\omega)\| \Big) \Big( \sup_{-\infty<t<\infty} \|F(t)\| \Big) < 1 .$$

Unfortunately it does not appear to be an easy matter to express the above
criterion as direct conditions on the original transfer function g(s) and the
functions k(t) and $q^2(t)$. In the case that k(t) or $q^2(t)$ is time-invariant,
however, it is possible to obtain a criterion which is a great deal more specific:

Corollary 1: Assume that k(t) = k is constant. Then $\Sigma'$ is asymptotically stable
in the mean square if:

$$\Big( \sup_{-\infty<t<\infty} q^2(t) \Big) \Big( \int_0^\infty (ce^{(A-kbc)t}b)^2\,dt \Big) < 1 .$$

Proof: The equation for $\Gamma$ may be modelled as the feedback system:

$$\dot{Q} = (A-kbc)Q + Q(A-kbc)' + bwb' ; \quad z = cQc' ,$$

$$w = q^2(t)z .$$

The first system has $(ce^{(A-kbc)t}b)^2$ as impulse response. Since this is always
nonnegative it follows that its Fourier transform attains its maximum for $\omega = 0$.
Since this maximum is given by $\int_0^\infty (ce^{(A-kbc)t}b)^2\,dt$ we obtain the corollary by
applying the circle criterion in the scalar case. ∎
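As a small numerical illustration of Corollary 1, consider the scalar case A = -alpha, b = c = 1 (an assumption made only for this sketch; the function names are also illustrative). The impulse response is then $e^{-(\alpha+k)t}$, the integral equals $1/(2(\alpha+k))$, and the criterion reads $\sup q^2(t) < 2(\alpha+k)$. The code estimates the integral by the trapezoidal rule and evaluates the test.

```python
import math

def squared_impulse_integral(alpha, k, horizon=40.0, steps=200_000):
    """Trapezoidal estimate of the integral of (c e^{(A-kbc)t} b)^2 over [0, horizon]
    for the scalar case A = -alpha, b = c = 1 (illustrative assumption)."""
    rate = alpha + k
    h = horizon / steps
    total = 0.5 * (1.0 + math.exp(-2.0 * rate * horizon))  # end points
    for i in range(1, steps):
        total += math.exp(-2.0 * rate * i * h)
    return total * h

def corollary1_holds(alpha, k, q2_sup):
    """Sufficient condition of Corollary 1: (sup q^2) * integral < 1."""
    return q2_sup * squared_impulse_integral(alpha, k) < 1.0

estimate = squared_impulse_integral(1.0, 1.0)   # closed form: 1 / (2 * (1 + 1))
```

In this scalar setting with constant $q^2$ the covariance equation reduces to $\dot{\Gamma} = (-2(\alpha+k) + q^2)\Gamma$, so the bound obtained from the corollary coincides with the exact stability boundary.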

Corollary 2: Assume that $q^2(t) = q^2$ is constant and let

$$G(s) = (c \otimes I + I \otimes c)(Is - A \otimes I - I \otimes A - q^2(b \otimes b)(c \otimes c))^{-1}(b \otimes I + I \otimes b) .$$

Then $\Sigma'$ is asymptotically stable in the mean square if:

(i) $q^2 \int_0^\infty (ce^{At}b)^2\,dt < 1$

and (ii) $\big( \sup_{-\infty<t<\infty} k(t) \big)\big( \sup_{-\infty<\omega<\infty} \|G(j\omega)\| \big) < 1$.

Proof: The equation for $\Gamma$ may be modelled as the feedback system:

$$\dot{Q} = AQ + QA' + q^2\,bcQc'b' + bv' + vb' ; \quad y = cQ ,$$

$$v = -k(t)y .$$

The first system has G(s) as transfer function and is stable if condition (i)
is satisfied. The corollary thus follows from the multivariable circle criterion
(see Brockett [1970], Section 33). ∎

Notes: 8. The conditions of Corollary 1 may be expressed in terms of
frequency-domain data. They then lead to conditions very similar to the
deterministic circle criterion (see Willems and Blankenship [1971]).

9. J.L. Willems [1972] has obtained a number of criteria for systems such as the
one studied here. His criteria, which are in the vein of Corollary 1, are sharper
and more explicit than those studied here.

10. It is well-known that the circle criterion gives the best conditions which
may be proven by means of a quadratic Lyapunov function. However in the case under
consideration one can obtain results by using "linear" Lyapunov functions. Indeed,
one may view the equation describing $\Gamma$ as a differential equation on the space $\mathcal{P}$ of
nonnegative definite symmetric (nxn) matrices. Restricting our attention to this
subset of the vector space S of symmetric (nxn) matrices does not buy us anything
as far as stability is concerned (i.e. stability on $\mathcal{P}$ is equivalent to stability on
S). However it enhances the likelihood that a particular function will be definite
and thus greatly enlarges the class of Lyapunov functions. For example the function
Trace $[P\Gamma]$ with P = P' > 0 is positive definite on $\mathcal{P}$ but not on S. It hence defines
a suitable Lyapunov function for studying the mean square stability question. This
method is exploited in Willems [1972].

CONCLUSIONS 

We have presented here a number of results on the stability of linear systems
with stochastic coefficients. Two average value criteria for almost sure stability
were derived and we showed how one may use deterministic stability results like the
multivariable circle criterion in order to obtain mean square stability criteria in
the case where the stochastic parameters are white noise processes.

Abstract


Because many systems of practical interest fall outside the scope
of linear theory it is desirable to enlarge as much as possible the
class of systems for which a complete structure theory is available.

In this paper a class of finite state sequential systems evolving in 
groups is considered. The concepts of controllability, observability, 
minimality, realizability, and the isomorphism of minimal realizations 
are developed. 

Results which are analogous to — but differ in essential details 
from — those of linear system theory are derived. These results are 
potentially useful in such diverse areas as algorithmic design and 
algebraic decoding.

1. Introduction

The purpose of this paper is to discuss certain questions related 
to the modeling of the input-output behavior of dynamical systems. 

We work in the context of systems with finite input, output, and state 
sets which admit group operations. The motivation for this study comes 
from a desire to understand better the key results in linear system 
theory (linear sequential machines included), and, more importantly, 
it comes from a desire to embrace in an analogous theory a broader class 
of input-output models than has heretofore been possible. Our results
are potentially useful in optimizing the basic recursions occurring in
certain elementary numerical processes, the mechanization of algebraic 
decoding procedures, etc. 

This paper might be regarded as a contribution to the investigation 
of system theory in the context of universal algebras. It does not 
include the vector space results as a special case but it does shed 
new light on the previous proofs in that context, in that it makes clear 
which results depend only on the additive group structure inherent in 
a vector space. We have not worked for the weakest hypothesis for each 
individual theorem but rather have sought to place all theorems in a 
common framework — one motivated by linear theory. 

Thus, a number of the results and proofs have direct analogs in
linear theory, and the proofs are presented to emphasize the universality
of these arguments. That is, one should read these results keeping the
following in mind. In the theory of algebra, there are a few basic
isomorphism theorems for groups, rings, vector spaces, etc., and one
obtains the results in one setting from those in another simply by
replacing the key words with their analogs - e.g. group for ring and
normal subgroup for ideal. The results here indicate that the same
type of universal structure and isomorphism results will hold in a
system-theoretic framework.

One of the most difficult steps in constructing a realization of 
input-output maps is the state assignment problem. This step is crucial 
in the design of recursive algorithms, filters, etc. One of the 
essential features of our work is that we give a recipe for solving some
problems of this type. 

2. Finite Group Homomorphic Sequential Systems 

Of course an empirical theory should avoid making assumptions 
which cannot be verified experimentally. However it is nonetheless 
useful to be able to anticipate the consequences of various assumptions 
about the internal mechanism of a phenomenon under study, even if we are,
in principle, incapable of verifying or denying the assumptions on the 
basis of experimentation. In this paper we want to investigate 
the properties of certain finite state systems which evolve in state 
spaces which admit a group structure and we verify in a constructive 
way the existence of this structure given the input-output data. 

Specifically, we consider a class of dynamical models of the form

$$x(k+1) = b[u(k)] \circ a[x(k)] ; \quad y(k) = c[x(k)]$$

where the input, output, and state spaces are the finite groups
$\mathcal{U} = (U,\cdot)$, $\mathcal{Y} = (Y,*)$, $\mathcal{X} = (X,\circ)$, respectively. The maps $a : \mathcal{X} \to \mathcal{X}$,
$b : \mathcal{U} \to \mathcal{X}$ and $c : \mathcal{X} \to \mathcal{Y}$ are assumed to be group homomorphisms.
Invoking an analogy with linear sequential systems, which are a special
case, we call this a finite group homomorphic sequential system.

This class of systems has many things in common with discrete time linear
systems. The most obvious is the following result.

Theorem 1: The input, initial state, and output of a finite group
homomorphic sequential system

$$x(k+1) = b[u(k)] \circ a[x(k)] ; \quad y(k) = c[x(k)]$$

are related by

$$x(k) = b[u(k-1)] \circ a[b[u(k-2)]] \circ \cdots \circ a^{k-1}[b[u(0)]] \circ a^k[x(0)]$$

$$\phantom{x(k)} = \Big\{ \prod_{i=0}^{k-1} a^{k-i-1}[b[u(i)]] \Big\} \circ a^k[x(0)]$$

$$y(k) = c[b[u(k-1)]] * c[a[b[u(k-2)]]] * \cdots * c[a^{k-1}[b[u(0)]]] * c[a^k[x(0)]]$$

$$\phantom{y(k)} = \Big\{ \prod_{i=0}^{k-1} c[a^{k-i-1}[b[u(i)]]] \Big\} * c[a^k[x(0)]]$$

where $a^k$ denotes k compositions of a with itself and the factors of each
product are written in order of decreasing i, from left to right.

Proof: This result follows directly from the system equations and
the fact that a and c are homomorphisms.
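The closed form of Theorem 1 can be compared with direct iteration on a small nonabelian example. The sketch below is an illustration under stated assumptions, not a construction from the paper: the state group is the symmetric group S_3 (permutations stored as tuples), a is conjugation by a fixed 3-cycle, b sends the nonzero element of Z_2 to a transposition, and c is the sign homomorphism into Z_2.

```python
from itertools import product

def compose(p, q):
    """Group operation on permutations: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

IDENTITY = (0, 1, 2)
G = (1, 2, 0)        # 3-cycle defining the automorphism a (illustrative choice)
SWAP = (1, 0, 2)     # image of the nonzero input under b (illustrative choice)

def a(x):            # inner automorphism x -> G x G^{-1}
    return compose(compose(G, x), inverse(G))

def b(u):            # homomorphism Z_2 -> S_3
    return SWAP if u % 2 else IDENTITY

def c(x):            # sign homomorphism S_3 -> Z_2 (0 even, 1 odd)
    return sum(1 for i in range(3) for j in range(i + 1, 3) if x[i] > x[j]) % 2

def simulate(inputs, x0):
    """Iterate x(k+1) = b[u(k)] o a[x(k)]; returns x(k) for k = len(inputs)."""
    x = x0
    for u in inputs:
        x = compose(b(u), a(x))
    return x

def power(f, k, x):
    """Apply f to x k times."""
    for _ in range(k):
        x = f(x)
    return x

def closed_form(inputs, x0):
    """b[u(k-1)] o a[b[u(k-2)]] o ... o a^{k-1}[b[u(0)]] o a^k[x(0)]."""
    k = len(inputs)
    x = IDENTITY
    for i in range(k - 1, -1, -1):   # decreasing i, left to right
        x = compose(x, power(a, k - 1 - i, b(inputs[i])))
    return compose(x, power(a, k, x0))

all_match = all(
    simulate(u, x0) == closed_form(u, x0)
    for n in range(5)
    for u in product((0, 1), repeat=n)
    for x0 in [(0, 1, 2), (0, 2, 1), (1, 2, 0)]
)
```

Because the state group is nonabelian, the order of the factors in `closed_form` matters; the output formula follows by applying `c` to either side.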

3. Realizability Criteria

In this section we give necessary and sufficient conditions for
an input-output map to have a sequential realization of the type under
consideration here. Recall that a sequence of linear maps of $E^m$ into $E^q$ is
realizable as the weighting pattern of a finite dimensional discrete time linear
system if and only if the sequence satisfies a linear recursion.

What we find here is that a sequence of homomorphisms of $\mathcal{U}$ into $\mathcal{Y}$
is realizable as the "weighting pattern" of a finite group homomorphic
sequential system if and only if the sequence satisfies a homomorphic
recursion.

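Let $\mathcal{U} = (U,\cdot)$ and $\mathcal{Y} = (Y,*)$ be finite groups. We then define
$F(\mathcal{U},\mathcal{Y})$ to be the finite set of maps of $\mathcal{U}$ into $\mathcal{Y}$. $F(\mathcal{U},\mathcal{Y})$ is a
semigroup under the operation

$$(fg)(u) = f(u)*g(u) , \quad f,g \in F(\mathcal{U},\mathcal{Y}) .$$

Suppose $\pi$ is a homomorphism of $\mathcal{Y}^r = \mathcal{Y} \times \cdots \times \mathcal{Y}$ (r factors) into $\mathcal{Y}$.
Then $\pi$ naturally induces a homomorphism $\hat{\pi}$ of $F(\mathcal{U},\mathcal{Y})^r$ into
$F(\mathcal{U},\mathcal{Y})$:

$$\hat{\pi}(A_1,\ldots,A_r)(u) = \pi(A_1(u),\ldots,A_r(u)) \quad \forall u \in \mathcal{U}, \ A_1,\ldots,A_r \in F(\mathcal{U},\mathcal{Y}) .$$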

Theorem 2 (see footnote 1): Let $\mathcal{U}$ and $\mathcal{Y}$ be finite groups. Given a sequence of group
homomorphisms $T_i : \mathcal{U} \to \mathcal{Y}$, i = 0,1,2,..., there exists a finite group $\mathcal{X}$
and group homomorphisms $a : \mathcal{X} \to \mathcal{X}$, $b : \mathcal{U} \to \mathcal{X}$ and $c : \mathcal{X} \to \mathcal{Y}$ such that

$$T_i(\cdot) = c[a^i[b(\cdot)]]$$

if and only if there is an integer r > 0 and a homomorphism

$$p : \mathcal{Y}^r \to \mathcal{Y}$$

such that for i = 0,1,2,...

$$\hat{p}(T_i, T_{i+1}, \ldots, T_{i+r-1}) = T_{i+r} .$$

1 It has recently been pointed out to us that for the special case of abelian
groups a realizability result is given in reference [6].

Proof: (Sufficiency) Suppose such a homomorphism exists. We
construct the analog of what has, in the context of linear system
theory, been called the standard observable realization [1]. Consider
the map of $\mathcal{Y}^r$ into itself defined by

$$a : (x_1, x_2, \ldots, x_r) \to (x_2, x_3, \ldots, x_r, p(x_1, x_2, \ldots, x_r))$$

This is clearly a homomorphism if p is. Now define b, taking
$\mathcal{U}$ into $\mathcal{Y}^r$ by

$$b : u \to (T_0(u), T_1(u), \ldots, T_{r-1}(u))$$

and again this is a homomorphism if each of the T's is. Define c
taking $\mathcal{Y}^r$ into $\mathcal{Y}$ according to

$$c : (y_1, y_2, \ldots, y_r) \to y_1$$

This too is a homomorphism. We claim that $c[a^i[b(\cdot)]] = T_i(\cdot)$.
This is true because of the recursion given by p:

$$c[b(\cdot)] = c(T_0(\cdot), T_1(\cdot), \ldots, T_{r-1}(\cdot)) = T_0(\cdot)$$

$$c[a[b(\cdot)]] = c(T_1(\cdot), T_2(\cdot), \ldots, \hat{p}(T_0, T_1, \ldots, T_{r-1})(\cdot)) = c(T_1(\cdot), T_2(\cdot), \ldots, T_r(\cdot)) = T_1(\cdot)$$

$$\vdots$$

$$c[a^{r-1}[b(\cdot)]] = c(T_{r-1}(\cdot), T_r(\cdot), \ldots, T_{2r-2}(\cdot)) = T_{r-1}(\cdot)$$

The rest of the relations follow in a similar manner by applying
the recursion.
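The construction in the sufficiency argument can be written out for a concrete abelian case. The sketch below assumes, purely for illustration, that the input and output groups are both Z_7 under addition, that T_0(u) = u and T_1(u) = 2u, and that the sequence obeys the recursion T_{i+2} = T_i + T_{i+1}, so that r = 2 and p(y_1, y_2) = y_1 + y_2. It then builds the shift-type maps a, b, c on pairs and checks that c[a^i[b(u)]] reproduces T_i(u).

```python
M = 7  # modulus of the cyclic group Z_7 (illustrative choice)

def T(i, u):
    """T_i(u) for the assumed sequence: T_0(u) = u, T_1(u) = 2u, T_{i+2} = T_i + T_{i+1}."""
    t0, t1 = u % M, (2 * u) % M
    for _ in range(i):
        t0, t1 = t1, (t0 + t1) % M
    return t0

def p(y1, y2):
    """Recursion homomorphism Y^2 -> Y."""
    return (y1 + y2) % M

def a(x):
    """Shift the pair and append p of the old pair."""
    return (x[1], p(x[0], x[1]))

def b(u):
    """Load the first r = 2 terms of the sequence."""
    return (T(0, u), T(1, u))

def c(x):
    """Read out the first component."""
    return x[0]

def realized(i, u):
    """Compute c[a^i[b(u)]]."""
    x = b(u)
    for _ in range(i):
        x = a(x)
    return c(x)

matches = all(realized(i, u) == T(i, u) for i in range(30) for u in range(M))
```

The state set here has 49 elements; nothing in the construction makes it minimal, which is the issue taken up in the later sections.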



(Necessity) Suppose that $T_i(\cdot) = c[a^i[b(\cdot)]]$ for some set of
homomorphisms a, b, and c with $a : \mathcal{X} \to \mathcal{X}$ being defined on a finite group.
Since the set of all maps of $\mathcal{X}$ into itself is a finite set, we see that
$a^r = a^k$ for some r > k ≥ 0. Then $a^{r+m} = a^{k+m}$ for all m ≥ 0. Then
defining p as the projection onto the (k+1)st component of an r-tuple

$$p(y_1, \ldots, y_r) = y_{k+1}$$

we see that

$$\hat{p}(T_i, T_{i+1}, \ldots, T_{i+r-1})(\cdot) = T_{i+k}(\cdot) = c[a^{i+k}[b(\cdot)]] = c[a^{i+r}[b(\cdot)]] = T_{i+r}(\cdot) .$$

We remark that the proof shows that the only sequences of homomorphisms
$\{T_i\}$ which can be realized by a finite state system are those which are
periodic after a finite number of terms (see figure 1). The next result
shows that a is an automorphism if and only if there is no "tail."


Corollary: Under the hypotheses of Theorem 2 there exists a realization
with a an automorphism if and only if $T_{k+\ell} = T_k$ for some $\ell$ and all
k = 0,1,2,...

Proof: This follows from the fact that a is an automorphism of a
finite group if and only if $a^k$ is the identity automorphism for some
k > 0.

In automata theory, one usually considers systems described by
maps of the form $f : U^* \to Y$ where $U^*$ is the set of all finite strings
of elements in U and $f(u_0, \ldots, u_{n-1})$ is the output of the system at time
n following the application of the input string $u_0, \ldots, u_{n-1}$ (in this
order). One can then ask which f's come from finite group homomorphic
sequential systems.

Theorem 3: Given finite groups $\mathcal{U} = (U,\cdot)$ and $\mathcal{Y} = (Y,*)$, and an
input-output map $f : U^* \to Y$. This can be realized as a finite group
homomorphic sequential system if and only if the maps $T_i : \mathcal{U} \to \mathcal{Y}$ defined by

$$T_i(u) = f(u, e, \ldots, e) \quad \text{(i identity inputs)}$$

are homomorphisms satisfying the conditions of Theorem 2, and

$$f(u_0, \ldots, u_k) = T_0(u_k) * T_1(u_{k-1}) * \cdots * T_k(u_0) .$$

Proof: The proof is a straightforward calculation. ∎

Note that the second condition in Theorem 3 is equivalent to the
following: if $\omega_1, \omega_2 \in U^*$ and the length of $\omega_2$ is k, then

$$f(\omega_1, \omega_2) = f(\omega_2) * f(\omega_1, e^k)$$

where $e^k \in U^*$ is the string of k identity inputs.

For an input-output map f corresponding to a finite group homomorphic
sequential system, one should think of the map from $U^r$
into $Y^r$ given by

$$y_r = f(u_0, \ldots, u_{r-1}) = T_0(u_{r-1}) * T_1(u_{r-2}) * \cdots * T_{r-1}(u_0)$$

$$y_{r+1} = f(u_0, \ldots, u_{r-1}, e) = T_1(u_{r-1}) * T_2(u_{r-2}) * \cdots * T_r(u_0)$$

$$\vdots$$

$$y_{2r-1} = f(u_0, \ldots, u_{r-1}, e^{r-1}) = T_{r-1}(u_{r-1}) * T_r(u_{r-2}) * \cdots * T_{2r-2}(u_0)$$

as being the analog of the map corresponding to the Hankel matrix. As
will be shown, the number of elements in the image space of this map equals the
number of states in the "minimal realization" just as the rank of
the Hankel matrix determines the dimension of the state space of a
minimal linear realization.

4. Controllability, Observability, and Minimal Systems 

One of the crucial results in linear system theory is that a
system is minimal if and only if it is controllable and observable
and any two controllable and observable realizations of the same
input-output map differ at most by a choice of basis for the state
space. This result has a natural analog here but the analog of a
related result, namely the fact that any input-output map which has a
linear realization has a controllable and observable linear realization,
fails. This means we must characterize all those systems which have
controllable and observable realizations and this is done in Theorem 8
below. We note that finite dimensional vector spaces over the same
field are isomorphic if and only if they are of the same dimension,
whereas finite groups can have the same number of elements and not be
isomorphic. Thus the state space isomorphism theorems are decidedly
more interesting here.

We say that the homomorphic sequential system

$$x(k+1) = b[u(k)] \circ a[x(k)] ; \quad y(k) = c[x(k)]$$

which evolves in the group $\mathcal{X} = (X,\circ)$ is controllable from $x_1 \in X$ if
for any $x_2 \in X$ there exists a sequence of controls in the input group
such that the state is driven from $x_1$ to $x_2$ by this sequence. The system
is said to be controllable if it is controllable from all $x \in X$. Two
states $x_1, x_2 \in X$ are said to be indistinguishable if, given any input
sequence, the corresponding output sequences from the initial states $x_1$ and $x_2$
are identical. Otherwise, $x_1$ and $x_2$ are said to be distinguishable, and an input
sequence that yields different output sequences from $x_1$ and $x_2$ is said to distinguish
between $x_1$ and $x_2$. We call the system observable if any distinct pair
of states are distinguishable.

Theorem 4: Consider the finite group homomorphic sequential system

$$x(k+1) = b[u(k)] \circ a[x(k)] ; \quad y(k) = c[x(k)]$$

with state group $\mathcal{X} = (X,\circ)$. Let $e_x \in X$ be the identity in $\mathcal{X}$. Then
the system is controllable if and only if it is controllable from $e_x$.
The states $x_1$ and $x_2$ are distinguishable if and only if the identity
control sequence distinguishes between them. Also $x_1$ is indistinguishable
from $x_2$ if and only if $x_1 \circ x_2^{-1}$ is indistinguishable from $e_x$.

Proof: These results are obtained by straightforward calculations. ∎

Thus, as in the case of linear systems, the test for controllability 
reduces to a test for controllability from the identity, and the test 
for observability to a test for indistinguishability from the identity. 

The next theorem gives a formula for the set reachable from 
the identity and the set indistinguishable from the identity. 





Theorem 5: If the finite group homomorphic sequential system

$$x(k+1) = b[u(k)] \circ a[x(k)] ; \quad y(k) = c[x(k)]$$

evolves in a group $\mathcal{X} = (X,\circ)$ with n elements then the set of states
reachable from the identity is

$$\mathcal{R} = \{ b(u_1) \circ a[b(u_2)] \circ \cdots \circ a^{n-1}[b(u_n)] \mid u_1, \ldots, u_n \in U \}$$

$$\phantom{\mathcal{R}} = b(U) \circ ab(U) \circ \cdots \circ a^{n-1}b(U)$$

The set of states indistinguishable from the identity is

$$\mathcal{K} = \mathrm{Ker}\, c(\cdot) \cap \mathrm{Ker}\, c[a(\cdot)] \cap \cdots \cap \mathrm{Ker}\, c[a^{n-1}(\cdot)]$$

The set $\mathcal{R}$ is not necessarily a group but $\mathcal{K}$ is a normal subgroup of $\mathcal{X}$.

Proof: With respect to the reachable set, this result is immediate
from the formula

$$x(k+1) = b(u(k)) \circ a[b(u(k-1))] \circ \cdots \circ a^{k-1}[b(u(1))] \circ a^k[x(1)]$$

and the observation that because of the stationarity of the system,
any state reachable from the identity is reachable along a trajectory
that contains no state more than once and thus is of length less than
or equal to n.


If the input sequence is a string of identity elements then the
output sequence from the identity state is simply a string of identity
elements in $\mathcal{Y}$. If the output from the state x is to be indistinguishable
from this string then it must happen that

$$c(x) = c[a(x)] = \cdots = c[a^{n-1}(x)] = \text{identity}$$

Can it happen that this set of equalities holds but $c[a^p(x)] \neq$ identity
for some p ≥ n? Clearly not because for any x, $a^i(x) = a^j(x)$ for
some n ≥ i > j ≥ 0 because there are only n elements in X. This means
that for any x and any positive integer p we have $a^p(x) = a^k(x)$ with
0 ≤ k ≤ n-1, where k, of course, depends on x and p. (Actually for
n ≥ 2, we can replace n-1 by n-2 in the expressions for $\mathcal{R}$ and $\mathcal{K}$ but
while this is easy to prove for $\mathcal{R}$, the result for $\mathcal{K}$ is more
cumbersome and we have thus omitted it.)

To see that $\mathcal{K}$ is a normal subgroup we need only observe that
the map of $\mathcal{X}$ into $\mathcal{Y}^n$ defined by

$$x \to (c(x), c[a(x)], \ldots, c[a^{n-1}(x)])$$

is a homomorphism and $\mathcal{K}$ is its kernel. That $\mathcal{R}$ need not be a subgroup
of $\mathcal{X}$ will be shown by example later. ∎


Corollary: Under the hypotheses of Theorem 5 the set $\mathcal{R}$ is a subgroup
if $\mathcal{X}$ is an abelian group.

Proof: We need only note that for all m ≥ 0, $a^m b(U)$ is a subgroup, and
that the product of two subgroups of an abelian group is itself a
subgroup. ∎

We now recall some of the concepts of abstract realization theory
([2], Ch. 10). If A and B are sets and we have an input-output map
$f : A \to B$, a factorization of f through a state set C is a pair of
maps $\alpha : A \to C$ and $\beta : C \to B$ such that $f = \beta \circ \alpha$, i.e. the following
diagram commutes:

$$A \xrightarrow{\ \alpha\ } C \xrightarrow{\ \beta\ } B , \qquad f : A \to B .$$

This factorization is canonical if $\alpha$ is onto and $\beta$ is one-to-one.
In this case, the "size" of C is minimal in some sense. For
instance if A, B, and C are vector spaces and f, $\alpha$, and $\beta$ are
linear maps, and if $\hat{C}$, $\hat{\alpha}$, $\hat{\beta}$ is any other, not necessarily canonical, factorization,
then dim C ≤ dim $\hat{C}$. Also, if A, B, C, and $\hat{C}$ are finite sets, with C
corresponding to a canonical and $\hat{C}$ to any other factorization, then
card(C) ≤ card($\hat{C}$).

Suppose we have an input group $\mathcal{U} = (U,\cdot)$, an output group $\mathcal{Y} = (Y,*)$,
and an input-output map $f : U^* \to Y$ that has at least one realization
as a finite group homomorphic sequential system:

$$x(k+1) = b[u(k)] \circ a[x(k)] , \quad y(k) = c[x(k)]$$

with finite state group $\mathcal{X} = (X,\circ)$. Suppose $\mathcal{X}$ has n elements, and
define $F : U^* \to Y^n$ by

$$F(u_0, \ldots, u_k) = (f(u_0, \ldots, u_k), f(u_0, \ldots, u_k, e), \ldots, f(u_0, \ldots, u_k, e^{n-1})) .$$

We then have a factorization of F:

$$U^* \xrightarrow{\ \mathcal{B}\ } \mathcal{X} \xrightarrow{\ m\ } \mathcal{Y}^n$$

where

$$\mathcal{B}(u_0, \ldots, u_k) = b(u_k) \circ ab(u_{k-1}) \circ \cdots \circ a^k b(u_0)$$

$$m(x) = (c(x), ca(x), \ldots, ca^{n-1}(x))$$

We immediately see that the above factorization is minimal if and only if
the system is controllable and observable. In this case we say that the
triple of homomorphisms (a,b,c) defines a minimal realization.

Another result of abstract realization theory is the following:
given $f : A \to B$ and two canonical factorizations - that is two sets
C and $\hat{C}$ and corresponding maps $\alpha : A \to C$, $\hat{\alpha} : A \to \hat{C}$, both onto,
and $\beta : C \to B$, $\hat{\beta} : \hat{C} \to B$, both one-to-one, such that $f = \beta \circ \alpha = \hat{\beta} \circ \hat{\alpha}$ -
then the two are equivalent, in that there exists a unique one-to-one and
onto map $\gamma : C \to \hat{C}$, such that $\hat{\alpha} = \gamma \circ \alpha$ and $\beta = \hat{\beta} \circ \gamma$.

When we apply this result to the problem of finite group homomorphic
sequential systems, we obtain stronger results, as in linear theory,
because of the structure of the systems.

Theorem 6: Suppose $\mathcal{U} = (U,\cdot)$ and $\mathcal{Y} = (Y,*)$ are finite groups, and
$f : U^* \to Y$ is an input-output map that has two controllable and observable
finite group homomorphic sequential realizations

$$x(k+1) = b[u(k)] \circ a[x(k)] ; \quad y(k) = c[x(k)] \qquad (1)$$

$$z(k+1) = g[u(k)] \bullet f[z(k)] ; \quad y(k) = h[z(k)] \qquad (2)$$

where the system (1) evolves in a finite state group $\mathcal{X} = (X,\circ)$ and
system (2) evolves in a finite state group $\mathcal{Z} = (Z,\bullet)$. Then there
exists a group isomorphism $p : \mathcal{X} \to \mathcal{Z}$ such that $f = pap^{-1}$, $g = pb$, and
$h = cp^{-1}$. The two realizations are said to be conjugate.

Proof: Suppose the cardinality of $\mathcal{X}$ is n. Then the same is true of
$\mathcal{Z}$ by the comments preceding the theorem. Let $F : U^* \to \mathcal{Y}^n$, $\mathcal{B} : U^* \to \mathcal{X}$,
and $m : \mathcal{X} \to \mathcal{Y}^n$ be as before, and define $\mathcal{G} : U^* \to \mathcal{Z}$ and $q : \mathcal{Z} \to \mathcal{Y}^n$ by

$$\mathcal{G}(u_0, \ldots, u_k) = g(u_k) \bullet fg(u_{k-1}) \bullet \cdots \bullet f^k g(u_0)$$

$$q(z) = (h(z), hf(z), \ldots, hf^{n-1}(z))$$

Then, by controllability and observability we have two canonical
factorizations of F and the commutative diagram

$$U^* \xrightarrow{\ \mathcal{B}\ } \mathcal{X} \xrightarrow{\ m\ } \mathcal{Y}^n , \qquad U^* \xrightarrow{\ \mathcal{G}\ } \mathcal{Z} \xrightarrow{\ q\ } \mathcal{Y}^n , \qquad p : \mathcal{X} \to \mathcal{Z} , \quad p \circ \mathcal{B} = \mathcal{G} , \quad q \circ p = m$$

where p is the unique one to one and onto map such that the diagram
remains commutative.

Let $x_1, x_2 \in X$. Then we have

$$q[p(x_1 \circ x_2)] = m(x_1 \circ x_2) = m(x_1) * m(x_2) = q[p(x_1)] * q[p(x_2)] = q[p(x_1) \bullet p(x_2)]$$

Since q is one-to-one $p(x_1 \circ x_2) = p(x_1) \bullet p(x_2)$. Thus p is an isomorphism.
It is then a simple computation to arrive at the relation between (a,b,c)
and (f,g,h). ∎

Note that in the theorem, the group structure of $\mathcal{U}$ is never used,
however the group structure of $\mathcal{Y}$ and the fact that m and q are both
one-to-one homomorphisms is used to show that p is an isomorphism.
This lack of symmetry in the arguments is discussed in the next section.

As was mentioned in Theorem 5, $\mathcal{R}$ - the set of states reachable from
the identity - need not be a subgroup. Thus, given a finite group
homomorphic sequential system, there need not exist a controllable system
of this type with the same input-output description. In fact, one
might expect that a homomorphic sequential system has a minimal realization
as a homomorphic sequential system if and only if the set $\mathcal{R}$ of states
reachable from $e_x$ is, in any particular realization, a subgroup. The
example below shows that this need not be the case. If $\mathcal{R}$ is a subgroup,
we can restrict our homomorphisms to $\mathcal{R}$, modulo the kernel of
$(c, ca, \ldots, ca^{n-1}) : \mathcal{X} \to \mathcal{Y}^n$, and thus construct
a controllable and observable homomorphic realization (a simple
check shows that one can redefine the homomorphisms in a well-defined
manner after extracting the kernel; therefore there always exists an observable
homomorphic realization). Thus, for example, if there exists a homomorphic
realization with an abelian state group, there exists a controllable and
observable homomorphic realization.

An example will illustrate these ideas. The dihedral group,
$D_n$, is a group of order 2n generated by two elements x and y which
satisfy the relations

$$x^n = e, \quad y^2 = e ; \quad xyx = y$$

where e is the group identity. The cyclic group of order n will be
denoted as $Z_n$, and its elements are {0,1,...,n-1}. Consider the finite
group homomorphic sequential system

$$x(k+1) = b(u(k)) \circ a[x(k)] ; \quad y(k) = c[x(k)]$$

where $\mathcal{U} = Z_2$, $\mathcal{X} = D_4$, $\mathcal{Y} = Z_2$, and a, b, and c are homomorphisms
uniquely determined by

$$b(1) = y$$

$$a(x) = e, \quad a(y) = xy$$

$$c(x) = 0, \quad c(y) = 1$$

The set of states reachable from e may be shown to be

$$\mathcal{R} = \{e, y, xy, x^3\}$$

which is not a subgroup.

However if we compute the input-output homomorphisms $T_k = ca^k b$
we find that

$$T_k = \text{identity for all } k \geq 0$$
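The reachable set of the dihedral example can be enumerated mechanically. The following sketch takes the state group to be D_4, which is the dihedral group in which $x^3 = yxy$ is the fourth reachable state, and encodes an element $x^i y^j$ as the pair (i, j); that encoding and the function names are assumptions of the sketch. It computes the set reachable from the identity, tests closure under the group operation, lists the states indistinguishable from the identity, and evaluates $T_k = ca^k b$.

```python
N = 4  # D_4; the element x^i y^j is stored as the pair (i, j)

def mul(g, h):
    """(x^i y^j)(x^k y^l) = x^{i + (-1)^j k} y^{j + l}, using y x = x^{-1} y."""
    (i, j), (k, l) = g, h
    return ((i + (k if j == 0 else -k)) % N, (j + l) % 2)

E, X, Y = (0, 0), (1, 0), (0, 1)
XY = mul(X, Y)
ELEMENTS = [(i, j) for i in range(N) for j in (0, 1)]

def a(g):
    """a(x) = e, a(y) = xy, hence a(x^i y^j) = (xy)^j."""
    return XY if g[1] else E

def b(u):
    """b(1) = y."""
    return Y if u % 2 else E

def c(g):
    """c(x) = 0, c(y) = 1."""
    return g[1]

def reachable_from_identity():
    seen, frontier = {E}, [E]
    while frontier:
        g = frontier.pop()
        for u in (0, 1):
            nxt = mul(b(u), a(g))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

def indistinguishable_from_identity():
    """States with c(x) = c[a(x)] = ... = c[a^{n-1}(x)] = 0, n = |D_4| = 8."""
    out = set()
    for g in ELEMENTS:
        h, ok = g, True
        for _ in range(len(ELEMENTS)):
            ok = ok and c(h) == 0
            h = a(h)
        if ok:
            out.add(g)
    return out

def T(k, u):
    """T_k(u) = c[a^k[b(u)]]."""
    g = b(u)
    for _ in range(k):
        g = a(g)
    return c(g)

R = reachable_from_identity()
K = indistinguishable_from_identity()
closed = all(mul(g, h) in R for g in R for h in R)
```

With this encoding `R` corresponds to {e, y, xy, x^3}, the product xy·y = x falls outside it, and `K` is the four-element rotation subgroup, so the system is not observable.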


Although the above nonminimal realization has an identity-reachable
set which is not a group, there still exists a minimal homomorphic
sequential system. In fact such a realization is found by taking
$\mathcal{U} = \mathcal{X} = \mathcal{Y} = Z_2$ and a = b = c = identity. The reason we can find such
a realization is that our original system is not observable. It is easy to see that
there exists a controllable and observable homomorphic sequential realization of a
given input-output map if and only if the identity-reachable set in any particular
observable realization is a group. An example of an observable system for which $\mathcal{R}$ is
not a group is found by modifying the previous example. Let $\mathcal{U}$, $\mathcal{X}$, a, and b
be as above, but let $\mathcal{Y} = \mathcal{X}$ and c = identity (i.e. state output). This
is observable, and $\mathcal{R}$ is the same as before.

There are conditions under which $\mathcal{R}$ is a subgroup, in which case we
do have a controllable and observable homomorphic realization. The following
theorem indicates one such condition.

Theorem 7: Under the hypotheses of Theorem 5 the set $\mathcal{R}$ of states reachable
from the identity is a subgroup of $\mathcal{X}$ if a is an automorphism.

Proof: The group of automorphisms of a finite group is itself a finite
group with function composition as the group operation. Thus there
exists a k > 0 such that

$$a^k = \text{identity automorphism}$$

From Theorem 1 we see that the set $\mathcal{R}$ of states reachable from the
identity can be written in the form

$$\mathcal{R} = \bigcup_{m>0} \prod_{i=0}^{m-1} a^i b(U)$$

$$\phantom{\mathcal{R}} = \bigcup_{m \geq 1} [b(U) \circ ab(U) \circ \cdots \circ a^{k-1}b(U)]^m$$

where U is the input group and for $H \subset X$

$$H^m = \{ h_1 \circ h_2 \circ \cdots \circ h_m \mid h_i \in H \}$$

Thus if $x, y \in \mathcal{R}$, we have that $x \in [b(U) \circ ab(U) \circ \cdots \circ a^{k-1}b(U)]^{m_1}$
and $y \in [b(U) \circ ab(U) \circ \cdots \circ a^{k-1}b(U)]^{m_2}$ for some $m_1$ and $m_2$. Then
$x \circ y \in [b(U) \circ ab(U) \circ \cdots \circ a^{k-1}b(U)]^{m_1+m_2}$. We see that for all
n > 0, $x^n \in \mathcal{R}$ if $x \in \mathcal{R}$. Since $\mathcal{X}$ is a finite group, there exists an
N > 0 such that $x^N = x^{-1}$. Therefore $\mathcal{R}$ is a subgroup. ∎
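Theorem 7 can be illustrated by changing only the map a in the dihedral example. In the sketch below, which is an illustration under stated assumptions rather than a case treated in the paper, the state group is again D_4 with $x^i y^j$ stored as (i, j), b(1) = y as before, and a is taken to be the inner automorphism $g \to x g x^{-1}$. The reachable set is then enumerated and tested for the subgroup property.

```python
N = 4  # D_4; the element x^i y^j is stored as the pair (i, j)

def mul(g, h):
    """(x^i y^j)(x^k y^l) = x^{i + (-1)^j k} y^{j + l}."""
    (i, j), (k, l) = g, h
    return ((i + (k if j == 0 else -k)) % N, (j + l) % 2)

def inv(g):
    """Inverse in D_4: rotations invert their exponent, reflections are involutions."""
    i, j = g
    return ((-i) % N, 0) if j == 0 else (i, 1)

E, X, Y = (0, 0), (1, 0), (0, 1)

def a(g):
    """Inner automorphism g -> x g x^{-1} (illustrative choice of automorphism)."""
    return mul(mul(X, g), inv(X))

def b(u):
    """b(1) = y."""
    return Y if u % 2 else E

def reachable_from_identity():
    seen, frontier = {E}, [E]
    while frontier:
        g = frontier.pop()
        for u in (0, 1):
            nxt = mul(b(u), a(g))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

def is_subgroup(subset):
    """For a finite subset, containing e and being closed under products suffices."""
    return E in subset and all(mul(g, h) in subset for g in subset for h in subset)

R = reachable_from_identity()
R_is_subgroup = is_subgroup(R)
```

Here the reachable set works out to {e, y, x^2, x^2 y}, a four-element subgroup, in contrast with the non-invertible a of the earlier example.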

The next theorem completely characterizes those sequences of input-
output homomorphisms which have controllable and observable finite-group
homomorphic sequential realizations. To do this, we must define what
we mean by a free response of a system. If a system is given
in recursive form (as our first equation), a free response is the identity-input
response of the system from some initial state. If the system is
given in input-output form, it is the response to an input sequence
which consists of the identity only, from some point onward, and where
the response is observed from the point in time where the non-identity
inputs stop. Thus we apply a (possibly) non-identity input up to time
k and record the output from time k+1 on. Note that the set of
free responses of an input-output map corresponds to the set of free
responses of a homomorphic realization of that map started in a state reachable
from the identity state. In what follows, free responses refer to the input-
output system description. Note that we can consider the set of free
responses as a subset of the infinite direct product group $\mathcal{Y} \times \mathcal{Y} \times \cdots \times \mathcal{Y} \times \cdots$.

Theorem 8: Let the sequence of homomorphisms $T_i : \mathcal{U} \to \mathcal{Y}$, i = 0,1,2,...,
with $\mathcal{U}$ and $\mathcal{Y}$ finite groups, satisfy the hypotheses of Theorem 2.
Then there exists a controllable and observable finite group homomorphic
sequential realization if and only if the set of free responses forms a
subgroup of the infinite direct product group.

Proof: (Sufficiency) Let $\mathcal{F}$ be the group of all free responses.
Let $\mathcal{F}_n$ be defined as follows

$$\mathcal{F}_n = \{ (y_0, y_1, \ldots, y_{n-1}) \in \mathcal{Y}^n \mid y_0, y_1, \ldots, y_{n-1} \text{ are the first n elements of a free response} \in \mathcal{F} \}$$

Obviously $\mathcal{F}_n$ is a subgroup of $\mathcal{Y}^n$ if $\mathcal{F}$ is a
subgroup of the infinite direct product.

Consider the standard observable realization given in the proof
of Theorem 2. In that realization, the state space is $\mathcal{Y}^r$,
and it is easy to see that the set of states reachable
from the identity is just $\mathcal{F}_r$. Then, restricting our homomorphisms to
$\mathcal{F}_r$ we have a minimal homomorphic realization.

(Necessity) Suppose we have a minimal homomorphic realization of the $T_i$:

$$x(k+1) = b[u(k)] \circ a[x(k)] ; \quad y(k) = c[x(k)]$$

Since every state is reachable from the identity, the set of free
responses in the input-output sense is identical to the set of free
responses in the state space sense. Consider the map from $\mathcal{X}$ into the
infinite direct product group given by

$$x \to (c(x), ca(x), \ldots, ca^k(x), \ldots)$$

This is obviously a homomorphism, and its image is $\mathcal{F}$, which therefore
must be a group. ∎

Corollary: Under the hypothesis of Theorem 8, if $\mathcal{F}$ is a group, $\mathcal{F}$ is
isomorphic to $\mathcal{F}_n$ for some n.

Proof: Suppose a is the state transition homomorphism for a minimal
realization. Then there exist k > p ≥ 0 such that $a^k = a^p$, and then

$$(c(x), ca(x), \ldots, ca^n(x), \ldots) = (c(x), ca(x), \ldots, ca^{k-1}(x), ca^p(x), \ldots, ca^{k-1}(x), \ldots)$$

and the isomorphism is obvious. Note that even if $\mathcal{F}$ is not a group, there
exists an n such that the elements of $\mathcal{F}$ and $\mathcal{F}_n$ are in one-to-one
correspondence.

5. Some Comments on State Space Reduction

A number of questions were raised in the preceding sections. 

We have derived the standard observable realization - what about a 
"standard controllable realization" in the sense of reference [1]? 

The set of states indistinguishable from the identity is a (normal) 
subgroup - why isn't the set of states reachable from the identity a subgroup? 
In Theorem 6 we used the fact that m and q are homomorphisms - what about
their input-side counterparts, such as the input-state map $\mathcal{B}$? We have
seen that the reachable set $\mathcal{R}$ need not be a group, and for similar
reasons the input-side maps aren't homomorphisms and there is no standard
controllable realization.

Note that these difficulties arise from the following consideration. 

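Suppose we have a set of homomorphisms $c_i$, i = 1, 2, ..., m, mapping a
finite group $\mathcal{X}$ into a finite group $\mathcal{Y}$. Then the "fan out" map taking
$\mathcal{X}$ into $\mathcal{Y} \times \cdots \times \mathcal{Y}$

$x \mapsto (c_1(x), \ldots, c_m(x))$

is always a homomorphism, but the "fan in" map taking
$\mathcal{X} \times \cdots \times \mathcal{X}$ into $\mathcal{Y}$

$(x_1, \ldots, x_m) \mapsto c_1(x_1) \circ c_2(x_2) \circ \cdots \circ c_m(x_m)$

need not be a homomorphism. (For example, the map of $\mathcal{X} \times \mathcal{X}$ into
$\mathcal{X}$ defined by group multiplication is typically not a homomorphism.)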
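
The fan out / fan in contrast can be checked by brute force on a small nonabelian group. The sketch below is only an illustration: the choice of the symmetric group S3, the two homomorphisms `c1` (identity) and `c2` (conjugation by a fixed transposition), and all function names are assumptions made for the example and do not come from the text.

```python
from itertools import permutations, product

# Elements of S3 as tuples p, where p[i] is the image of i.
S3 = list(permutations(range(3)))

def mul(p, q):
    """Group product: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0] * 3
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def pair_mul(s, t):
    """Componentwise product on the direct product S3 x S3."""
    return (mul(s[0], t[0]), mul(s[1], t[1]))

G = (1, 0, 2)  # a fixed transposition (illustrative choice)

def c1(p):
    return p  # identity homomorphism

def c2(p):
    return mul(mul(G, p), inv(G))  # conjugation by G, an automorphism

def fan_out(x):
    return (c1(x), c2(x))

def fan_in(s):
    return mul(c1(s[0]), c2(s[1]))

def is_homomorphism(f, domain, dom_mul, cod_mul):
    """True when f(s t) == f(s) f(t) for every pair in the domain."""
    return all(f(dom_mul(s, t)) == cod_mul(f(s), f(t))
               for s in domain for t in domain)

fan_out_ok = is_homomorphism(fan_out, S3, mul, pair_mul)
fan_in_ok = is_homomorphism(fan_in, list(product(S3, S3)), pair_mul, mul)
print(fan_out_ok, fan_in_ok)  # prints: True False
```

The fan in map fails because it would require every element of the first factor to commute with every image of the second, which a nonabelian group does not allow.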

In the rest of this section, we will discuss these problems in some
depth. We will also present some additional conditions which enable us to
circumvent some of the difficulties.

Even if $\mathcal{R}$ is a group, we cannot be sure that the map $\mathcal{B}$ is a homomorphism.
If $\mathcal{X}$ has n elements, then the map defined by

$\mathcal{B} : \mathcal{U} \times \cdots \times \mathcal{U}$ (n times) $\to \mathcal{R}$

$\mathcal{B}(u_0, \ldots, u_{n-1}) = b(u_{n-1}) \circ a\,b(u_{n-2}) \circ \cdots \circ a^{n-1} b(u_0)$

is onto. We would like to investigate putting a semidirect product
structure on $\mathcal{U} \times \cdots \times \mathcal{U}$ in order to make $\mathcal{B}$ a homomorphism. We have
the following necessary condition:

Theorem 9 : Consider a finite group homomorphic sequential system. If
there exists a semidirect product structure on $\mathcal{U} \times \cdots \times \mathcal{U}$ (n times)
such that $\mathcal{B} : \mathcal{U} \times \cdots \times \mathcal{U} \to \mathcal{X}$ is a homomorphism, then the set of
states reachable from the identity in k steps is a group for all k > 0.

Proof : Choose $k \in \{1, \ldots, n\}$. Consider the set of input strings

$\mathcal{U}_k = \{ (e, \ldots, e, u_0, \ldots, u_{k-1}) \mid u_0, \ldots, u_{k-1} \in \mathcal{U} \}$

For any semidirect product structure on $\mathcal{U} \times \cdots \times \mathcal{U}$, this is a
subgroup. Thus $\mathcal{B}(\mathcal{U}_k)$ is a subgroup if $\mathcal{B}$ is a homomorphism, and
$\mathcal{B}(\mathcal{U}_k)$ is just the set of elements reachable from the identity in k
steps. For k > n use Theorem 5.

We now modify the earlier example. We concern ourselves with
the input-state side of the system only. Again let the state group be
$\mathcal{X} = D_4$, with the input group and b as before, but redefine a by

$a(y) = xy$ , $a(xy) = y$

It is easy to check that a is an automorphism of $D_4$, and thus by
Theorem 7 $\mathcal{R}$ is a subgroup. However the set of states reachable from the
identity in two steps, $\mathcal{B}(\mathcal{U}_2)$, is

$\{e, y, xy, x^3\}$

which is not a group, and thus $\mathcal{B}$ is not a homomorphism for any
semidirect product structure on $\mathcal{U} \times \cdots \times \mathcal{U}$.
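
The modified example can be simulated directly. The following sketch encodes the element x^i y^j of D_4 as the pair (i, j); it assumes, consistently with the relations b(1) = y quoted later in the text, that the input group is Z_2 = {0, 1} with b(0) = e. The encoding and all function names are illustrative choices, not notation from the paper.

```python
# D4 = <x, y | x^4 = y^2 = e, y x = x^-1 y>; (i, j) stands for x^i y^j.
D4 = [(i, j) for i in range(4) for j in range(2)]
E, X, Y = (0, 0), (1, 0), (0, 1)

def mul(g, h):
    """(x^i y^j)(x^k y^l) = x^(i + (-1)^j k) y^(j + l)."""
    (i, j), (k, l) = g, h
    return ((i + (k if j == 0 else -k)) % 4, (j + l) % 2)

def power(g, n):
    """n-fold product of g with itself (identity for n = 0)."""
    r = E
    for _ in range(n):
        r = mul(r, g)
    return r

XY, X3 = mul(X, Y), power(X, 3)

def a(g):
    """Automorphism fixed by a(x) = x^3, a(y) = xy (so a(xy) = y)."""
    i, j = g
    return mul(power(X3, i), power(XY, j))

def b(u):
    """Assumed input map from Z_2: b(0) = e, b(1) = y."""
    return Y if u else E

def reachable_in(k):
    """States reachable from e in k steps of x(k+1) = b(u(k)) o a(x(k))."""
    states = {E}
    for _ in range(k):
        states = {mul(b(u), a(s)) for u in (0, 1) for s in states}
    return states

def is_subgroup(s):
    """A finite subset containing e and closed under products is a subgroup."""
    return E in s and all(mul(g, h) in s for g in s for h in s)

two_step = reachable_in(2)
print(sorted(two_step), is_subgroup(two_step))
print(len(reachable_in(4)))
```

Under these assumptions the two-step set is {e, y, xy, x^3}, which is not closed under multiplication, while after four steps every element of D_4 has been reached.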

These examples illustrate an asymmetry in the theory. Unlike linear
system theory - or even the abelian group case here, where it is clear 
that none of these difficulties appear - we do not have a naive duality 
theory without additional assumptions. 

An assumption that avoids some of these difficulties
is that of requiring a to be a normal endomorphism. A homomorphism f
of a group $\mathcal{G}$ into itself is called a normal endomorphism if for all
$x, y \in \mathcal{G}$

$x f(y) x^{-1} = f(x y x^{-1})$


Theorem 10 : Consider the finite group homomorphic sequential system

$x(k+1) = b[u(k)] \circ a[x(k)]$ ; $y(k) = c[x(k)]$

evolving in a finite group $\mathcal{X}$ of order n. Suppose a is a normal
endomorphism. Then there exists a semidirect product structure on
$\mathcal{U} \times \cdots \times \mathcal{U}$ (n times) such that the input-state map $\mathcal{B}$ is a
homomorphism, and thus the identity-reachable set $\mathcal{R}$ is a subgroup.

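Proof : Define the binary operation on $\mathcal{U} \times \cdots \times \mathcal{U}$ (n times)

$(u_0, u_1, \ldots, u_{n-1})(v_0, v_1, \ldots, v_{n-1}) =$

$(v_1^{-1} v_2^{-1} \cdots v_{n-1}^{-1}\, u_0\, v_{n-1} \cdots v_2 v_1 v_0,\;\; v_2^{-1} v_3^{-1} \cdots v_{n-1}^{-1}\, u_1\, v_{n-1} \cdots v_3 v_2 v_1,\;\; \ldots,$

$\;\; v_{n-1}^{-1}\, u_{n-2}\, v_{n-1} v_{n-2},\;\; u_{n-1} v_{n-1})$
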

Direct computation verifies that this does define a semidirect
product structure on $\mathcal{U} \times \cdots \times \mathcal{U}$ (n times), and another computation,
using the fact that a is normal, verifies that $\mathcal{B}$ is a homomorphism.

Thus, in this case, we can reduce our system to a minimal
homomorphic realization by first restricting the homomorphisms to $\mathcal{R}$ and
then taking $\mathcal{R}$ modulo the kernel of m, the state-output map (see
Theorem 6). We then have the following canonical factorization of
the input-output map $m \circ \mathcal{B}$

$m \circ \mathcal{B} = m' \circ \mathcal{B}'$ , with $\mathcal{B}' : \mathcal{U} \times \cdots \times \mathcal{U} \to \hat{\mathcal{X}}$

where $\hat{\mathcal{X}}$ is the reduced state group, and $\mathcal{B}'$ and m' are the reduced
input-state and state-output homomorphisms, with $\mathcal{B}'$ onto and m' one
to one.

Another question arises in the case where $\mathcal{R}$ is not a group. When
this happens, we have $x_1, x_2 \in \mathcal{R}$ such that $x_1 \circ x_2 \notin \mathcal{R}$. Thus this
particular group multiplication never occurs in the operation of the
system and is irrelevant information. One can then ask whether or
not we can redefine these irrelevant multiplications in such a manner
as to make $\mathcal{R}$ a group, while at the same time requiring that a, b, and
c remain homomorphisms when restricted to $\mathcal{R}$. The example given
previously shows that, at least in some cases, this can be done. Again
let $\mathcal{U} = Z_2$, $\mathcal{X} = D_4$ with a, b, c defined by b(1) = y; a(x) = e,
a(y) = xy; c(x) = 0, c(y) = 1. We saw that

$\mathcal{R} = \{e, y, xy, x^3\}$

The superfluous multiplications are $(xy) \circ y$, $(xy) \circ x^3$, $x^3 \circ y$, and
$x^3 \circ x^3$. If we define these as follows

$(xy) \,\hat{\circ}\, y = x^3$        $x^3 \,\hat{\circ}\, y = xy$

$(xy) \,\hat{\circ}\, x^3 = y$        $x^3 \,\hat{\circ}\, x^3 = e$

then $\mathcal{R}$ is the Klein-4 group, and it is easy to check that a, b, and c
are still homomorphisms. In fact, since the Klein-4 group is abelian,
a is a normal endomorphism, and we can reduce our system as described above.
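
The claim about the redefined multiplication can be confirmed by enumeration. In this sketch x^i y^j in D_4 is again encoded as (i, j); the table `REDEFINED` holds the four redefined products listed above, and the maps a and c are written in the closed forms that follow from a(x) = e, a(y) = xy and c(x) = 0, c(y) = 1. The names and the encoding are assumptions of the example.

```python
# D4 elements x^i y^j encoded as (i, j), with y x = x^-1 y.
E, Y, XY, X3 = (0, 0), (0, 1), (1, 1), (3, 0)
R = [E, Y, XY, X3]

def d4_mul(g, h):
    """Ordinary product in D4."""
    (i, j), (k, l) = g, h
    return ((i + (k if j == 0 else -k)) % 4, (j + l) % 2)

# The four redefined ("superfluous") products.
REDEFINED = {(XY, Y): X3, (X3, Y): XY, (XY, X3): Y, (X3, X3): E}

def new_mul(g, h):
    """Product on R: the redefined value if listed, else the D4 product."""
    return REDEFINED.get((g, h), d4_mul(g, h))

def a(g):
    """a(x) = e and a(y) = xy give a(x^i y^j) = (xy)^j."""
    return XY if g[1] else E

def c(g):
    """Output map into Z_2 with c(x) = 0, c(y) = 1."""
    return g[1]

closed = all(new_mul(g, h) in R for g in R for h in R)
abelian = all(new_mul(g, h) == new_mul(h, g) for g in R for h in R)
involutive = all(new_mul(g, g) == E for g in R)
associative = all(new_mul(new_mul(g, h), k) == new_mul(g, new_mul(h, k))
                  for g in R for h in R for k in R)
a_is_hom = all(a(new_mul(g, h)) == new_mul(a(g), a(h)) for g in R for h in R)
c_is_hom = all(c(new_mul(g, h)) == (c(g) + c(h)) % 2 for g in R for h in R)
print(closed, abelian, involutive, associative, a_is_hom, c_is_hom)
```

A closed, associative, abelian table on four elements in which every element squares to the identity is the Klein four-group, which is what the checks above establish for the redefined product.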

6. Conclusions 

In this paper we have considered a broader class of input-output 
relations than those found in linear system theory and have derived results 
analogous to some of the more crucial properties of linear systems. 

In particular, we have considered dynamical systems of the form 

$x(k+1) = b[u(k)] \circ a[x(k)]$ ; $y(k) = c[x(k)]$

where the input, state, and output spaces are finite groups, and
a, b, and c are homomorphisms. The concepts of controllability,
observability, and minimality are developed, and conditions for the realization
of an input-output map by such a system are given. As in the linear
case, the equivalence of any two minimal homomorphic realizations is
established.

In addition, several problems, all directly or indirectly
related to duality, arise in considering this broader class of systems.
These are discussed, and it is shown that an additional assumption
removes these problems.

The analogy with linear theory has by no means been completely
exploited. Concepts such as transform theory have not been considered 
at all. Also, extensions of some of these results to infinite group 
problems can be made, possibly making contact with the study of 
dynamical systems on topological groups [7].

Abstract

We show in this paper that in constructing a theory for the most 
elementary class of control problems defined on spheres, some results 
from Lie theory play a natural role. In particular to understand con- 
trollability, optimal control, and certain properties of stochastic 
equations, Lie theoretic ideas are needed. The framework considered 
here is probably the most natural departure from the usual linear system/ 
vector space problems which have dominated the control systems literature. 
For this reason our results are compared with those previously available 
for the finite dimensional vector space case.

1. Introduction


Specific results about control systems whose state spaces are 
spheres have been useful in understanding problems in energy conversion, 
controlled rigid body dynamics, etc. Some examples are mentioned in 
our earlier paper [1]. Here we work out in more detail, and in greater 
generality, the theory for a class of problems of this type and compare 
our results with the case where the state space is a vector space. To
carry out this program requires some results from Lie theory, Lie groups 
acting on spheres, etc. There has been no attempt here to discuss the 
most general setting in which techniques which we use are applicable. 

Instead we have taken the sphere problems as a model and have studied a range
of control-theoretic questions in that setting. A number of possible
generalizations will be apparent. 

To begin with we mention some well known facts about linear system 
theory. We do this to make the paper a little more accessible to those 
not familiar with control problems and to sensitize the reader to certain 
issues important in control. For a more complete account and references 
to the literature one can consult [2] for the deterministic results and 
[3] for the stochastic results. 

Linear system theory deals with the pair of equations

$\dot{x}(t) = Ax(t) + Bu(t)$ ; $y(t) = Cx(t)$    (1.1)

where $\dot{x}$ denotes a time derivative. It is assumed that $x(t) \in \mathbb{R}^n$, $u(t) \in \mathbb{R}^m$
and that y(t) is likewise a real vector. For simplicity we take A, B, C to be
constant matrices.

One calls u the control, x the state and y the output. The theory of linear
systems is extensive but for our present purposes we point out only
the following five results.

i) (1.1) is said to be controllable if for every $x_0$ and $x_1$ in $\mathbb{R}^n$
and every $t_1 > 0$ there exists a piecewise continuous control u(·) such
that if $x(0) = x_0$ then $x(t_1) = x_1$. A necessary and sufficient condition
for controllability is that rank$(B, AB, \ldots, A^{n-1}B) = n$ where , indicates a
column partition.

ii) (1.1) is said to be observable if for every $x_1 \neq x_2$ and every
$t_1 > 0$ the outputs corresponding to $x_1$ and $x_2$ differ on the interval
$[0, t_1]$. A necessary and sufficient condition for observability is that
rank$(C; CA; \ldots; CA^{n-1}) = n$ where ; indicates a row partition.

iii) If (1.1) is controllable then for every given $x_0$ and $x_1$ in $\mathbb{R}^n$
and every $t_1 > 0$ there exists a piecewise continuous control u defined on
$[0, t_1]$ which transfers the state from $x_0$ at t = 0 to $x_1$ at $t = t_1$ and
minimizes

$\eta = \int_0^{t_1} u'(t)u(t)\,dt$    (1.2)

relative to all other piecewise continuous controls which accomplish
the same transfer.

iv) If there exists a linear feedback control law u = Fx such that
$\dot{x} = (A+BF)x$ has a null solution which is asymptotically stable then there
exists a control law u = Kx such that $\lim_{t \to \infty} x(t) = 0$ and the functional

$\eta = \int_0^{\infty} u'(t)u(t) + y'(t)y(t)\,dt$

is minimized by setting u(t) = Kx(t).

v) If (1.1) is controllable and if the differential equation $\dot{x} = Ax$
is asymptotically stable then the associated stochastic equation (for
notation see [3])

$dx(t) = Ax(t)dt + B\,dw(t)$    (1.3)

has a unique invariant Gaussian measure which has zero mean and variance
Q satisfying

$AQ + QA' = -BB'$    (1.4)
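
The rank conditions in i) and ii) are easy to evaluate mechanically. The sketch below is a minimal illustration using exact rational arithmetic from the Python standard library; the double-integrator matrices and every function name are assumptions chosen for the example, not data from the text.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def rank(m):
    """Rank by Gauss-Jordan elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def controllability_matrix(a, b):
    """Column partition (B, AB, ..., A^(n-1) B)."""
    n = len(a)
    blocks, block = [], b
    for _ in range(n):
        blocks.append(block)
        block = matmul(a, block)
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]

def observability_matrix(a, c):
    """Row partition (C; CA; ...; C A^(n-1))."""
    rows, block = [], c
    for _ in range(len(a)):
        rows.extend(block)
        block = matmul(block, a)
    return rows

# Illustrative data: a double integrator driven through its second state.
A = [[0, 1], [0, 0]]
B = [[0], [1]]
C = [[1, 0]]
print(rank(controllability_matrix(A, B)))          # prints: 2
print(rank(observability_matrix(A, C)))            # prints: 2
print(rank(controllability_matrix(A, [[1], [0]]))) # prints: 1
```

With the input moved to the first state the rank drops to 1, so that variant of the example fails the controllability test.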

In this paper we establish analogs for each of these results for
systems of the type

$\dot{x}(t) = (A + \sum_{i=1}^m u_i(t)B_i)x(t)$ ; $y(t) = Cx(t)$    (1.5)

where $A, B_1, B_2, \ldots, B_m$ are skew symmetric matrices and the system can be
thought of as evolving on the sphere $\|x(t)\| = \|x(0)\|$.
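
A short check of the sphere claim, as a sketch using only the skew symmetry assumed for (1.5): differentiating the squared norm along any solution gives

```latex
\frac{d}{dt}\|x(t)\|^2
  = \dot{x}'(t)\,x(t) + x'(t)\,\dot{x}(t)
  = x'(t)\Big[(A + A') + \sum_{i=1}^{m} u_i(t)\,(B_i + B_i')\Big]x(t)
  = 0 ,
```

since A + A' = 0 and each B_i + B_i' = 0, so the norm of x(t) stays equal to the norm of x(0) for every choice of the controls.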

One significant point in the linear theory is that the matrix B is
generally not invertible and cases for which it is invertible are so infrequent
as to be virtually without interest. If B is invertible then by an
appropriate choice of basis equation (1.1) becomes

$\dot{x}(t) = Ax(t) + u(t)$    (1.6)

and controllability is automatic. Moreover, in this case problems iii)
and iv) are easily reduced to variational problems of the classical type

$\eta = \int_{t_0}^{t_1} L(x, \dot{x})\,dt$    (1.7)

with L quadratic in x and $\dot{x}$ and positive definite. Control theory
works with the more general "degenerate" case where $L_{\dot{x}\dot{x}}$ is only nonnegative
definite but certain constraints are in effect. If the above integral is
thought of as the action integral in a mechanics problem then the case
treated in control theory allows for the possibility of certain zero
masses provided there are appropriate linear constraints between position
and velocity. It can also be thought of as a limiting case of an
unconstrained dynamical problem where certain masses and associated energies go
to infinity. This second interpretation is generally more useful. Remarks
of the same type apply to equation (1.3) where existence of a smooth
transition density is well known if B is invertible whereas the same is true,
but for rather more subtle reasons, if we assume controllability instead
of invertibility of B.

2. Controllability

One of the main areas of applicability of Lie theory in control has
been that of determining the set of points reachable along solution
curves of $\dot{x}(t) = f(x(t), u(t), t)$ for the set of all piecewise continuous
controls u(·). For studies of this kind see references [4-10]. If the
control equations are of the form

$\dot{x}(t) = (A + \sum_{i=1}^m u_i(t)B_i)x(t)$ ; $x(t) \in \mathbb{R}^n$    (2.1)

then the system typically evolves on a manifold in $\mathbb{R}^n$. The determination
of the set of points reachable from a given point $x_0$ can be accomplished
by the determination of the set of matrices reachable from the identity
for the matrix equation

$\dot{X}(t) = (A + \sum_{i=1}^m u_i(t)B_i)X(t)$ ; $X(0) = I$    (2.2)

and then letting this set act on $x_0$ via ordinary matrix-vector multiplication.
Equation (2.2) can be thought of as defining a control problem on a matrix
Lie group. The question of determining what matrices are reachable from
the identity along solutions of (2.2) has been the subject of a number of
papers [1, 7-10]. Following Jurdjevic and Sussmann, we term systems of the
form of (2.2) right invariant. This is appropriate because the vector fields
defined on $G\ell(n)$ by the right side of (2.2) are invariant under the
translation defined by right multiplication with an element of $G\ell(n)$. We will say
that equation (2.2) is controllable on a group $\mathcal{G}$ if any two points in $\mathcal{G}$ can
be joined by a solution curve generated by some piecewise continuous control
u(·).

Suppose that A and $B_1, B_2, \ldots, B_m$ are all skew symmetric. Then
regardless of the choice of u the solutions of equation (2.1) remain
on the sphere defined by $\|x(t)\| = \|x(0)\|$. We will say that the
system (2.1) is controllable on the sphere if any two points on the sphere
can be joined by a solution curve generated by some piecewise continuous
control u(·). Phrased another way, the system is controllable if the set
of matrices reachable from the identity along solutions of (2.2) acts
transitively on $S^{n-1}$. From earlier results [10] we know that since the
motion is confined to a subgroup of SO(n) the set of matrices reachable
from I is the matrix Lie group consisting of all the matrices which can
be expressed as products of the form $\exp H_1 \exp H_2 \cdots \exp H_p$ where $H_1$,
$H_2, \ldots, H_p$ belong to the Lie algebra generated by $A, B_1, \ldots, B_m$.

Now of course the orthogonal group SO(n) acts transitively on $S^{n-1}$
so that if the algebra generated by $A, B_1, B_2, \ldots, B_m$ is the full set of
skew symmetric matrices then the system (2.1) is controllable on $S^{n-1}$.
However there are certain subgroups of SO(n) which act transitively on $S^{n-1}$
as well. The real compact forms of the classical Lie groups are all
candidates. The results are well known [11] but we repeat them here.

For example, it is clear that both the full unitary group and the special
unitary group of dimension n act transitively on the set of complex n-vectors
whose Hermitian length is one. But this set is just a set of vectors with
components $(x_i + \sqrt{-1}\,y_i)$ such that

$\sum_{i=1}^n (x_i^2 + y_i^2) = 1$    (2.3)

which is a 2n-1 dimensional sphere. Thus by defining the realification [12]
of the unitary algebras by the Lie algebra homomorphism

$B \mapsto \begin{pmatrix} \mathrm{Re}\,B & \mathrm{Im}\,B \\ -\mathrm{Im}\,B & \mathrm{Re}\,B \end{pmatrix}$    (2.4)

we obtain a set of real matrices whose associated group acts transitively
on $S^{2n-1}$. The real compact form of $C_n$ is the intersection of the special
unitary group and the symplectic groups. Naturally this representation
is in terms of matrices of even dimension so that they can act on even
dimensional complex vectors only. Thus, by analogy with the unitary case, the real
compact form of $C_n$ acts on the sphere $S^{4n-1}$. This action is known
to be transitive and of course we can add to the algebra real multiples of $\sqrt{-1}\,I$
to get the "full quaternion-unitary group" which acts transitively as well.

These four cases, each valid for all integer n, together with three particular
ones account for all possibilities. The particular cases may be explained
as follows. The exceptional algebra $G_2$ admits a 7 dimensional skew-symmetric
representation whose exponential acts transitively on $S^6$. The
spin representation of SO(7) is 8 dimensional and it acts transitively
on $S^7$. The spin representation of SO(9) is 16 dimensional and it acts
transitively on $S^{15}$. With this explanation we can state the following result.

Theorem 1 : Let $A, B_1, \ldots, B_m$ be a collection of n by n skew symmetric
matrices. The control system

$\dot{x}(t) = (A + \sum_{i=1}^m u_i(t)B_i)x(t)$    (2.5)

is controllable on $S^{n-1}$ if the algebra generated by $A, B_1, B_2, \ldots, B_m$ is that of

i) SO(n) for n odd

ii) SO(n) or the realification of SU(n/2) or U(n/2) for n even

iii) the realification of Sp(n/4) for n divisible by 4

iv) $G_2$ if n = 7, Spin(7) if n = 8, or Spin(9) if n = 16

Moreover, if the Lie algebra is not one of these cases the system (2.5)
is not controllable.
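
In the simplest situation covered by Theorem 1, the test is whether the generated Lie algebra is the full set of skew symmetric matrices, which has dimension n(n-1)/2. The sketch below computes the dimension of the Lie algebra generated by a list of matrices by closing a basis under brackets with exact arithmetic. The 3 by 3 generators and all function names are illustrative assumptions, and the dimension count alone does not distinguish the other transitive cases of the theorem.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def bracket(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def rank(m):
    """Rank by Gauss-Jordan elimination over exact fractions."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def flatten(m):
    return [v for row in m for v in row]

def lie_algebra_dimension(generators):
    """Dimension of the matrix Lie algebra generated by the given matrices."""
    basis = []
    queue = list(generators)
    while queue:
        m = queue.pop()
        # Keep m only if it is linearly independent of the current basis.
        if rank([flatten(x) for x in basis + [m]]) > len(basis):
            basis.append(m)
            # Brackets with every basis element must eventually be examined.
            queue.extend(bracket(m, other) for other in basis)
    return len(basis)

# Illustrative generators of rotations about two coordinate axes in R^3.
A = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
n = 3
print(lie_algebra_dimension([A, B]) == n * (n - 1) // 2)  # prints: True
print(lie_algebra_dimension([A]))                         # prints: 1
```

For these generators the algebra is all of the 3 by 3 skew symmetric matrices, so case i) of the theorem applies on S^2; with A alone the algebra is one dimensional and the reachable points from a given state lie on a circle.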

If the system is not controllable on $S^{n-1}$ it is sometimes of interest
to compute exactly what points can be reached from a given initial state.
The determination of what points belong to this set is facilitated by
a knowledge of the structure of the representation defined by the matrices
in the algebra generated by $A, B_1, \ldots, B_m$. If this representation is not
irreducible then its reduction is clearly the first step in the determination
of the reachable set. The properties of the irreducible pieces may reveal
the form of the reachable set in a straightforward way. For example, if
the evolution equation can be decomposed as

$\dot{x} = [I \otimes A^1 + A^2 \otimes I + \sum_{i=1}^m u_i (I \otimes B_i^1 + B_i^2 \otimes I)]x(t)$    (2.6)

then the Kronecker product of the reachable group for

$\dot{X}(t) = (A^1 + \sum_{i=1}^m u_i(t)B_i^1)X(t)$    (2.7)

and the reachable group for

$\dot{X}(t) = (A^2 + \sum_{i=1}^m u_i(t)B_i^2)X(t)$    (2.8)

contains the reachable group for equation (2.2). The reachable group will
not, in general, simply be the Kronecker product of the reachable groups unless
the effects of the u's are decoupled.

For the linear evolution equation (1.1) it happens that if it is possible 
to transfer any state to any other state then this transfer can be done 
in arbitrarily small time. This is not the case for systems defined by
equation (2.1). Jurdjevic and Sussmann [9] give an example of a system
defined on $S^2$ which is controllable but certain transfers cannot be
made in less than 1 unit of time. Thus if (2.1) is controllable on $S^{n-1}$,
the strongest statement we can make on the basis of the present analysis
is that for $t_1$ sufficiently large every state can be transferred to every
other state in $t_1$ units of time. Estimates on this time have not yet been
worked out.

In the vector space case controllability is closely related to the 
concept of observability as mentioned in the introduction. In the present 
setting this is not the case at all. We say that the system

$\dot{x}(t) = (A + \sum_{i=1}^m u_i(t)B_i)x(t)$ ; $y(t) = Cx(t)$    (2.9)

is observable on $S^{n-1}$ if no two distinct initial states on $S^{n-1}$ give rise
to the same response y for all controls u(·). The following theorem gives
a necessary and sufficient condition for observability.

Theorem 2 : Let $A, B_1, B_2, \ldots, B_m$ be a collection of skew symmetric
matrices and let c be a unit vector. The control system

$\dot{x}(t) = (A + \sum_{i=1}^m u_i(t)B_i)x(t)$ ; $y(t) = cx(t)$

is observable on $S^{n-1}$ if and only if the set of matrices $\{A, B_1, B_2, \ldots, B_m, cc'\}$
is irreducible.

For a proof of this theorem and more general results of this type
see [13].

3. Optimal Control

Consider again the evolution equation (2.2) defined on a matrix
group $\mathcal{G}$. Let there be given a time $t_1 > 0$ and boundary conditions
of the form $X(0) = X_0$ ; $X(t_1) = X_1$. Suppose that in addition there
is given a functional which is of the action type

$\eta_1 = \frac{1}{2}\int_0^{t_1} \sum_{i=1}^m u_i^2(t)\,dt$    (3.1)

as opposed to the geodesic type

$\eta_2 = \int_0^{t_1} \big(\sum_{i=1}^m u_i^2(t)\big)^{1/2}\,dt$    (3.2)

Our problem is to determine if there exists a control u(·) such that
the boundary conditions are met and the given functional is minimized and,
if such a control exists, to characterize it. Just as with controllability,
there is an obvious connection between problems defined on a group and problems
defined on a manifold on which that group acts. This would no longer be
the case if $\eta$ depended on x in a general way.

We will use the formalism of the maximum principle of Pontryagin [14] 
rather than the calculus of variations to attack this problem because it 
handles the degeneracy which is built into the problem in a natural way. 

Applied to the present problem, Pontryagin's maximum principle asserts that
if u(·) is an optimizing control then there exists a matrix P such that

$\dot{P}(t) = -A'P(t) - \sum_{i=1}^m u_i(t)B_i'P(t)$    (3.3)

and H defined by

$H(P, X, u) = \langle P, AX \rangle + \sum_{i=1}^m u_i \langle P, B_iX \rangle + \sum_{i=1}^m \frac{1}{2}u_i^2$    (3.4)

is minimized with respect to u by the optimal control. Thus we have the
optimal control given by

$u_i(t) = \langle -P(t), B_iX(t) \rangle$    (3.5)

This choice of u gives a pair of differential equations with split
boundary conditions

$\begin{pmatrix} \dot{X}(t) \\ \dot{P}(t) \end{pmatrix} = \begin{pmatrix} A & 0 \\ 0 & -A' \end{pmatrix} \begin{pmatrix} X(t) \\ P(t) \end{pmatrix} - \sum_{i=1}^m \langle P, B_iX \rangle \begin{pmatrix} B_i & 0 \\ 0 & -B_i' \end{pmatrix} \begin{pmatrix} X(t) \\ P(t) \end{pmatrix}$    (3.6)

The problem can be reduced to a single quadratic equation with split
boundary conditions by introducing K = XP'. An easy calculation shows
that

$\dot{K}(t) = AK(t) - K(t)A' - \sum_{i=1}^m \langle B_i', K(t) \rangle (B_iK(t) - K(t)B_i')$    (3.7)

So far everything is valid for an arbitrary subgroup of $G\ell(n)$. If
$A, B_1, \ldots, B_m$ are self contragredient then a simplification occurs.
In that case any solution of the differential equation for P can be
expressed in terms of a solution of the differential equation for X with
nonsingular boundary conditions; i.e. P(t) = NX(t)M for some constant matrices
M and N. Specializing to the skew symmetric case gives the following result.

Theorem 4 : Suppose that $A, B_1, B_2, \ldots, B_m$ are skew symmetric n by n matrices
and suppose that there exists a piecewise continuous control u(·) which
transfers the state of the matrix system

$\dot{X}(t) = (A + \sum_{i=1}^m u_i(t)B_i)X(t)$    (3.8)

from $X_0$ at t = 0 to $X_1$ at $t = t_1 > 0$. Then there exist constant
matrices M and N such that the solution of

$\dot{X}(t) = (A + \sum_{i=1}^m \langle B_i, X(t)MX'(t)N \rangle B_i)X(t)$ ; $X(0) = X_0$    (3.9)

passes through $X_1$ at $t = t_1$. Moreover, there exists one such pair
M, N which minimizes $\eta_1$ relative to any other continuous u(·) which
steers the system to $X_1$ from $X_0$ in the same period of time.

Proof : That there exists an optimal control follows from theorem 6 of
Cesari [15]. The rest follows from the maximum principle as discussed
above.

There is an alternative point of view available for these problems 
which makes a little closer contact with both physics and Lie theory 
but which is not so useful here. Consider the right-invariant control
equation in SO(n) with control $\Omega$

$\dot{X}(t) = \Omega(t)X(t)$ ; $X(0) = X_0$    (3.10)

Let the problem be to pick $\Omega$ in the space of skew symmetric matrices
such that $X(t_1) = X_1$ and the trace form

$\eta = \int_0^{t_1} -\mathrm{tr}\,(I^{-1}\Omega)^2\,dt$    (3.11)

is minimized. Elementary variational arguments with due regard for the
admissibility of variations lead to the Euler equation

$\dot{\Omega} = \Omega I \Omega I^{-1} - I^{-1} \Omega I \Omega$    (3.12)

In SO(3) this matrix equation is equivalent to the familiar Euler
equations for a rigid body

$I_1\dot{\omega}_1 = (I_2 - I_3)\,\omega_2\omega_3$

$I_2\dot{\omega}_2 = (I_3 - I_1)\,\omega_3\omega_1$    (3.13)

$I_3\dot{\omega}_3 = (I_1 - I_2)\,\omega_1\omega_2$

which, after all, come from minimizing the action integral on SO(3).
(Note that the kinetic energy of a rigid body can be expressed by
the trace form $(\det I)\,\mathrm{tr}(I^{-1}\Omega)^2$ where I is the usual inertia tensor.
See [2] page 64.) Incidentally, this also serves to define the degree
of difficulty of actually solving the control problem mentioned above. 
Since it is well known that the solution of the Euler equations generally 
involves elliptic functions, the solution of the optimal control problems 
cannot be expressed in terms of elementary functions except in special 
cases. 

By far the simplest special case on SO(n) occurs when $\eta$ is the
negative of the integral of the Killing form. That is, given X(0) and
X(1) and given the evolution equation

$\dot{X}(t) = \sum_{i=1}^{n(n-1)/2} u_i(t)B_iX(t)$ ; $X \in SO(n)$    (3.14)

where $B_i = -B_i'$ and for all i and j

$\langle B_i, B_j \rangle = \mathrm{tr}\,B_i'B_j = \delta_{ij}$    (3.15)

one finds that the optimal trajectory is

$X(t) = e^{\Omega t}X(0)$    (3.16)

where $\Omega$ is the solution of $e^{\Omega} = X(1)X^{-1}(0)$ which has the smallest Frobenius
norm.
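
For SO(3) the minimal-norm solution is easy to describe: it is the rotation through an angle θ in [0, π] about a fixed axis, and a skew symmetric generator of a unit-rate rotation has Frobenius norm √2. The sketch below computes that norm from the end points; it is specialised to 3 by 3 rotation matrices, and the helper names and the test rotations are assumptions of the example.

```python
from math import acos, cos, sin, sqrt, pi, isclose

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def rotation_angle(r):
    """Angle in [0, pi] of a 3 by 3 rotation, from trace(R) = 1 + 2 cos(theta)."""
    c = (r[0][0] + r[1][1] + r[2][2] - 1) / 2
    return acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

def minimal_log_norm(x0, x1):
    """Frobenius norm of the smallest skew Omega with exp(Omega) = X(1) X(0)^-1."""
    r = matmul(x1, transpose(x0))  # X(0)^-1 = X(0)' for orthogonal matrices
    return sqrt(2) * rotation_angle(r)

def rot_z(t):
    """Rotation through angle t about the third axis (illustrative end points)."""
    return [[cos(t), -sin(t), 0.0], [sin(t), cos(t), 0.0], [0.0, 0.0, 1.0]]

print(isclose(minimal_log_norm(rot_z(0.0), rot_z(pi / 2)), sqrt(2) * pi / 2))
print(isclose(minimal_log_norm(rot_z(0.0), rot_z(3 * pi / 2)), sqrt(2) * pi / 2))
```

The second call illustrates the role of the smallest norm: a rotation through 3π/2 is reached more cheaply as a rotation through π/2 in the opposite sense.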

We turn now to applying the above results to the problem of
optimizing trajectories on spheres. Note that trajectories on spheres can
be optimized for fixed end points by solving an associated right invariant
group problem and then picking the minimizing element in the group for
transferring $x_0$ to $x_1$. The following theorem expresses this.

Theorem 5 : Let $A, B_1, \ldots, B_m$ be skew symmetric matrices. Suppose
that the system

$\dot{x}(t) = (A + \sum_{i=1}^m u_i(t)B_i)x(t)$    (3.17)

is controllable on $S^{n-1}$. Then given a sufficiently large time $t_1 > 0$ and given points
$x_0$ and $x_1$ in $S^{n-1}$, there exists a control which transfers the system from
$x_0$ at t = 0 to $x_1$ at $t = t_1$ and minimizes

$\eta = \int_0^{t_1} u'(t)u(t)\,dt$    (3.18)

Moreover, there exists a matrix $K_0$ such that the optimal
control is given by $u_i(t) = \langle K(t), B_i \rangle$ where K is defined by the matrix
differential equation

$\dot{K}(t) = [A, K(t)] + \sum_{i=1}^m \langle K(t), B_i \rangle [K(t), B_i]$ ; $K(0) = K_0$    (3.19)

We complete this section on optimal control with a result of the 
type which plays a major role in linear system theory in connection with 
the regulator problem. 

Theorem 6 : Let A and B be n by n skew symmetric matrices and consider
the system

$\dot{x}(t) = Ax(t) + u(t)Bx(t)$    (3.20)

Let a be a unit vector in the null space of A such that A and Baa'B' are a
pair of matrices which act irreducibly on the orthogonal complement of
the one dimensional subspace defined by a. Then the control law u(t) =
a'Bx(t) steers the system from any initial state $x_0 \neq -a$ to a and minimizes
the integral

$\eta = \int_0^{\infty} u^2(t) + [a'Bx(t)]^2\,dt$

relative to any other continuous control u(·).

Proof : We can write $\eta$ as

$\eta = \int_0^{\infty} u^2(t) - 2a'\dot{x}(t) + [a'Bx(t)]^2\,dt + 2a'x(t)\Big|_0^{\infty}$

since Aa = 0 we have

$\eta = \int_0^{\infty} (u(t) - a'Bx(t))^2\,dt + 2a'x(t)\Big|_0^{\infty}$

Thus if the control law u(t) = a'Bx(t) actually drives the state x to
a then it is optimal. However, observing that a'x(t) has a derivative
along the given solution which is equal to $[a'Bx(t)]^2$, we see by
LaSalle's theorem (see e.g. [2]) that the solution x = a can fail to be
stable if and only if $a'Be^{At}x$ vanishes identically for some $x \neq \pm a$.
By looking at the derivatives at t = 0 we see that this can happen if
and only if $(Ba, ABa, \ldots, A^{n-1}Ba)$ fails to span the orthogonal complement of the one
dimensional subspace defined by a.

4. Stochastic Differential Equations

We consider now a third aspect of control theory on spheres. 

This has to do with the analog of property (v) mentioned in the
introduction. What we show is that controllability implies the existence
of a unique invariant measure for a stochastic equation on $S^{n-1}$. We use
Ito notation for stochastic differential equations. Wong [3] can be
consulted for an explanation of both the mathematics and the notation.

Let $w_1, w_2, \ldots, w_m$ denote independent Wiener (Brownian motion)
processes of unity variance. In giving a precise meaning to differential
equations in which something like "white noise" appears K. Ito [16]
invented what has proven to be a very successful calculus in which the
standard differentiation rule is significantly modified insofar as
differentials of Wiener processes are concerned. In this calculus $dw_i\,dw_j =
\delta_{ij}\,dt$, a first order term; $dw_i\,dt$ and $(dt)^2$ are both higher than first
order. We discuss the implication of this in one important special case.
If x and y are vectors satisfying the Ito differential equations

$dx(t) = Ax(t)dt + Bx(t)dw(t)$    (4.1)

$dy(t) = Fy(t)dt + Gy(t)dw(t)$    (4.2)

then z(t) = x(t)y'(t) satisfies the Ito equation

$dz(t) = (Az(t) + z(t)F' + Bz(t)G')dt + (Bz(t) + z(t)G')dw$    (4.3)
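
The product rule (4.3) can be checked directly from the multiplication rules just quoted. The following lines are a sketch for a single scalar Wiener process w, keeping terms up to first order:

```latex
\begin{aligned}
dz &= (dx)\,y' + x\,(dy)' + (dx)(dy)' \\
   &= (Ax\,dt + Bx\,dw)\,y' + x\,(Fy\,dt + Gy\,dw)' + Bx\,y'G'\,(dw)^2 \\
   &= (Az + zF' + BzG')\,dt + (Bz + zG')\,dw ,
\end{aligned}
```

where (dw)^2 = dt supplies the extra drift term BzG' that the ordinary product rule would miss.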


The only other fact we need about Ito equations concerns the associated
mean equation. If x and y satisfy equations (4.1) and (4.2) then
$\bar{x}(t) = \mathcal{E}x(t)$ and $\bar{y}(t) = \mathcal{E}y(t)$ satisfy the ordinary differential equations

$\dot{\bar{x}}(t) = A\bar{x}(t)$    (4.4)

$\dot{\bar{y}}(t) = F\bar{y}(t)$    (4.5)

We will see that these two results permit the derivation of equations
for all moments and imply that the moment equations are decoupled from
each other.

Recall that the number of linearly independent degree p forms in
n variables is given by

$N(n,p) = \binom{n+p-1}{p}$    (4.6)

We can therefore associate with each n-tuple $(x_1, x_2, \ldots, x_n)$ an N(n,p)-tuple
$x^{[p]} = (x_1^p, x_1^{p-1}x_2, \ldots, x_n^p)$ whose entries are the degree p monomials,
scaled by coefficients chosen in such a way as to validate the equality

$\|x\|^p = \|x^{[p]}\|$    (4.7)
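
Both the count (4.6) and the norm identity (4.7) can be checked numerically. The sketch below assumes the scaling suggested by the multinomial theorem, namely that the monomial with exponents (a_1, ..., a_n) carries the weight sqrt(p!/(a_1! ... a_n!)); that choice, the ordering of the entries, and the function names are assumptions of the example rather than details taken from the text.

```python
from itertools import combinations_with_replacement
from math import comb, factorial, isclose, prod, sqrt

def multi_indices(n, p):
    """All exponent vectors (a_1, ..., a_n) of non-negative integers summing to p."""
    out = []
    for combo in combinations_with_replacement(range(n), p):
        alpha = [0] * n
        for i in combo:
            alpha[i] += 1
        out.append(tuple(alpha))
    return out

def lifted(x, p):
    """x^[p]: degree p monomials weighted by sqrt(p! / (a_1! ... a_n!))."""
    result = []
    for alpha in multi_indices(len(x), p):
        weight = factorial(p) / prod(factorial(a) for a in alpha)
        result.append(sqrt(weight) * prod(xi ** a for xi, a in zip(x, alpha)))
    return result

def norm(v):
    return sqrt(sum(t * t for t in v))

x = [1.0, -2.0, 0.5]
print(len(multi_indices(3, 2)) == comb(3 + 2 - 1, 2))  # prints: True
print(isclose(norm(lifted(x, 3)), norm(x) ** 3))       # prints: True
```

The identity holds because expanding (x_1^2 + ... + x_n^2)^p by the multinomial theorem produces exactly the squared weighted monomials.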

It is clear that if x satisfies an ordinary differential equation which
is linear, say

$\dot{x}(t) = Ax(t)$    (4.8)

then $x^{[p]}$ also satisfies a linear differential equation

$\dot{x}^{[p]}(t) = A_{[p]}x^{[p]}(t)$    (4.9)

We regard this as a definition of $A_{[p]}$. It is related to the classical
idea of an induced representation. Of course if there are controls present
a similar set of equations follow; i.e. equation (2.1) implies

$\dot{x}^{[p]}(t) = A_{[p]}x^{[p]}(t) + \sum_{i=1}^m u_i(t)B_{i[p]}x^{[p]}(t)$    (4.10)

Similar remarks hold for stochastic equations of the type under
consideration here, provided suitable allowance is made for the Ito
calculus. Associated with the Ito equation

$dx(t) = Ax(t)dt + \sum_{i=1}^m B_ix(t)dw_i$    (4.11)

is the family of equations

$dx^{[p]}(t) = \Big[\big(A - \tfrac{1}{2}\sum_{i=1}^m B_i^2\big)_{[p]} + \tfrac{1}{2}\sum_{i=1}^m (B_{i[p]})^2\Big]x^{[p]}(t)dt + \sum_{i=1}^m B_{i[p]}x^{[p]}(t)dw_i$    (4.12)

The derivation of this is a straightforward exercise using the properties
of $dw_i$ outlined above. Finally, we have the moment equations associated
with (4.11)

$\frac{d}{dt}\bar{x}^{[p]}(t) = \Big[\big(A - \tfrac{1}{2}\sum_{i=1}^m B_i^2\big)_{[p]} + \tfrac{1}{2}\sum_{i=1}^m (B_{i[p]})^2\Big]\bar{x}^{[p]}(t)$    (4.13)

where $\bar{x}^{[p]}(t) = \mathcal{E}x^{[p]}(t)$. Compare with reference 17.

In terms of the Ito calculus when can the matrix stochastic equation 


m 


dX(t) » AX(t)dt + l dw (t)B.X(t) 

i-1 


(4.14) 


be thought of as evolving the orthogonal group? This will be the case 
when the associated vector equation (4.11) evolves on the sphere defined 
by ||x(t)|| = | | x (0) | ( for all x(0). Using the facts outlined above 
we see that d(x'x) » 0 if and only if for all i 

m 


B i " - B i ’ 


- i K--( a - i i b \y 

i=l 1 1 i=l 1 


(4.15) 


Thus these are the conditions under which equation (4.14) evolves in the 
orthogonal group and the conditions under which (4.11) evolves on the 
sphere. 
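
The conditions (4.15) can be checked numerically. By the Ito rule applied to (4.11), d(x'x) has dt coefficient $x'(A + A')x + \sum_i \|B_i x\|^2$ and $dw_i$ coefficient $x'(B_i + B_i')x$. The sketch below draws random skew symmetric matrices, constructs an A that satisfies (4.15), and evaluates both coefficients at a random point. The dimensions, seed and helper name `random_skew` are arbitrary assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 3


def random_skew(rng, n):
    """Random skew symmetric n x n matrix."""
    G = rng.standard_normal((n, n))
    return G - G.T


B = [random_skew(rng, n) for _ in range(m)]
S = random_skew(rng, n)
# Choose A so that A - (1/2) sum_i B_i^2 equals the skew matrix S, as (4.15) requires.
A = S + 0.5 * sum(Bi @ Bi for Bi in B)

x = rng.standard_normal(n)
# Coefficient of dt in d(x'x) for the Ito equation (4.11).
drift = x @ (A + A.T) @ x + sum((Bi @ x) @ (Bi @ x) for Bi in B)
# Coefficient of dw_i in d(x'x), one value per noise channel.
diffusion = [x @ (Bi + Bi.T) @ x for Bi in B]
print(drift, diffusion)
```

Both quantities vanish up to rounding, so the norm of x is preserved along sample paths of the exact equation; a discretized simulation would preserve it only approximately.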

It is apparent that the measure associated with the uniform density
on the sphere is an invariant measure for the process defined by equation
(4.11). Since the area of the (n-1)-sphere is $2\pi^{n/2}/\Gamma(n/2)$ the uniform density
is

$p_0(x) = \Gamma(n/2)/2\pi^{n/2}$    (4.16)
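
As a quick sanity check of the area formula behind (4.16), the following sketch evaluates it for small n, where the values are familiar: the circle has length $2\pi$ and the ordinary sphere has area $4\pi$. The function names are illustrative only.

```python
import math


def sphere_area(n):
    """Area of the unit (n-1)-sphere in R^n: 2 pi^(n/2) / Gamma(n/2)."""
    return 2.0 * math.pi ** (n / 2) / math.gamma(n / 2)


def uniform_density(n):
    """Constant density p_0 of (4.16) on the unit (n-1)-sphere."""
    return 1.0 / sphere_area(n)


for n in (2, 3, 4):
    print(n, sphere_area(n), uniform_density(n))
```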



The corresponding values of the odd moments are zero by symmetry but the 

even moments are not. The following theorem claims that all the moments 

approach the moments associated with a uniform distribution if we have 

controllability. Incidentally, equation (4.13) provides a means for actually 

computing the moments for all time in terms of their values at t = 0.

Theorem 7: Suppose that $A, B_1, B_2, \ldots, B_m$ are all skew symmetric and suppose that

$\dot{x}(t) = \left(A + \sum_{i=1}^{m} u_i(t) B_i\right) x(t)$    (4.17)

is controllable on $S^{n-1}$. Then the solution of the Ito differential
equation defined on the sphere by

$dx(t) = \left(A + \tfrac{1}{2}\sum_{i=1}^{m} B_i^2\right) x(t)\,dt + \sum_{i=1}^{m} B_i x(t)\,dw_i(t)$    (4.18)

is such that all moments approach the moments associated with a uniform
distribution on the (n-1)-sphere as t approaches infinity.

Proof: First of all, note the shift in notation from (4.11) to (4.18).
In (4.11) $A - \tfrac{1}{2}\sum_{i=1}^{m} B_i^2$ is playing the role played by A alone here. It is
not difficult to show that because $A, B_1, B_2, \ldots, B_m$ are skew symmetric it
follows that $A_{[p]}, B_{1[p]}, B_{2[p]}, \ldots, B_{m[p]}$ are also skew symmetric. A second
observation concerns stability. If A = -A' and $B_i = -B_i'$ then all
solutions of the ordinary differential equation

$\dot{x}(t) = \left(A + \tfrac{1}{2}\sum_{i=1}^{m} B_i^2\right) x(t)$    (4.19)

are bounded. Moreover, each solution approaches zero as t approaches
infinity provided $B_i e^{At} x$ does not vanish identically for any $x \neq 0$, and
there will exist nonzero vectors such that $B_i e^{At} x$ vanishes identically
if and only if A and $B_i$ can be put in the form (4.20).

To prove the first of these facts we notice that since A = -A'

$\frac{d}{dt}\|x(t)\|^2 = -\sum_{i=1}^{m} \|B_i x(t)\|^2$    (4.21)

Thus by LaSalle's theorem (see e.g. [2]) the solution either goes to zero
or else there is a solution along which $\|B_i x(t)\|$ vanishes identically
for all i. That solution would have to be of the form $e^{At} x_0$. As for the
conditions on A and $B_i$, they follow from considering the subspace of
vectors such that $B_i e^{At} x$ vanishes, together with its orthogonal complement,
making use of the skew symmetry of $A, B_1, B_2, \ldots, B_m$.

Clearly controllability implies that all solutions of the mean 
equation approach zero as t approaches infinity because controllable 
systems cannot be decomposed as indicated. As for the higher moments, 
we must distinguish between the even and odd cases. For the odd cases 
if there is a decomposition then controllability of the equation (4.17)
is clearly impossible. For the even moments, we have in view of
the identity $\|x^{[p]}\| = \|x\|^p$ a decomposition of the type given by
equation (4.20) but with the zero block in $B_{i[p]}$ being one dimensional.

The one dimensional subspace defines the steady state value of the 
even moments. On the orthogonal complement the equation (4.18) is 
asymptotically stable. These remarks are related to some well known 
properties of orthogonal representations of Lie algebras. 

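$A = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}$;    $B_i = \begin{bmatrix} \tilde{B}_i & 0 \\ 0 & 0 \end{bmatrix}$    (4.20)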
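
The argument in the proof can be illustrated numerically for the second moments on the 2-sphere. The sketch below is a special case chosen for convenience, not the general construction: it assumes n = 3, A = 0, and two standard rotation generators as $B_1, B_2$ (their brackets span the skew symmetric 3 by 3 matrices), and it uses square-root-of-multinomial weights for the monomials. Since A in (4.18) plays the role of $A - \tfrac{1}{2}\sum B_i^2$ in (4.11), the moment equation (4.13) then has generator $A_{[p]} + \tfrac{1}{2}\sum (B_{i[p]})^2$. All function names are hypothetical helpers.

```python
import itertools
import math

import numpy as np


def exponents(n, p):
    """Exponent tuples (k_1, ..., k_n) summing to p, starting from (p, 0, ..., 0)."""
    all_k = itertools.product(range(p + 1), repeat=n)
    return sorted((k for k in all_k if sum(k) == p), reverse=True)


def weight(k):
    """Assumed monomial weight: square root of the multinomial coefficient."""
    c = math.factorial(sum(k))
    for ki in k:
        c //= math.factorial(ki)
    return math.sqrt(c)


def induced(A, p):
    """Matrix A_[p] acting on the weighted degree-p monomials."""
    n = A.shape[0]
    ks = exponents(n, p)
    index = {k: row for row, k in enumerate(ks)}
    Ap = np.zeros((len(ks), len(ks)))
    for k in ks:
        for j in range(n):
            if k[j] == 0:
                continue
            for l in range(n):
                m = list(k)
                m[j] -= 1
                m[l] += 1
                m = tuple(m)
                Ap[index[k], index[m]] += weight(k) * k[j] * A[j, l] / weight(m)
    return Ap


def so3_generator(axis):
    """Skew symmetric generator of rotations about a coordinate axis in R^3."""
    E = np.zeros((3, 3))
    i, j = [a for a in range(3) if a != axis]
    E[i, j], E[j, i] = -1.0, 1.0
    return E


A = np.zeros((3, 3))                       # assumed drift; zero keeps M symmetric
B = [so3_generator(0), so3_generator(1)]   # assumed noise directions
p = 2

# Generator of the second-moment equation for (4.18).
M = induced(A, p) + 0.5 * sum(induced(Bi, p) @ induced(Bi, p) for Bi in B)

eigvals, V = np.linalg.eigh(M)             # valid because M is symmetric when A = 0
# Second moments for the deterministic initial state x(0) = (1, 0, 0), in the
# monomial order x1^2, x1 x2, x1 x3, x2^2, x2 x3, x3^2.
m0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
t = 30.0
m_t = V @ (np.exp(eigvals * t) * (V.T @ m0))
print(np.round(eigvals, 6))
print(np.round(m_t, 6))
```

In this example the generator has a single zero eigenvalue, corresponding to the invariant quadratic form x'x, and the remaining eigenvalues are negative. The second moments therefore settle at the values 1/3 for each squared coordinate and zero for the cross terms, as expected for a uniform distribution on the 2-sphere.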
As is well known, the moments $\bar{x}^{[p]}$ are related to the spherical
harmonics in a direct way. Thus by working with equation (4.13) it
is possible to obtain a full solution to the Fokker-Planck equation
associated with the Ito equation (4.18). The interpretation of the
moments in terms of spherical harmonics also allows one to establish
some qualitative features of the probability density. In particular
its smoothness and convergence to the steady state can be easily
studied.





References 




1. R.W. Brockett, "System Theory on Group Manifolds and Coset
Spaces," SIAM J. on Control, Vol. 10, No. 2, May 1972,
pp. 265-284.

2. R.W. Brockett, Finite Dimensional Linear Systems, J. Wiley,
New York, 1970.

3. E. Wong, Stochastic Processes in Information and Dynamical
Systems, McGraw-Hill, 1971.

4. R. Hermann, "On the Accessibility Problem in Control Theory,"
International Symposium on Nonlinear Differential Equations and
Nonlinear Mechanics, Academic Press, N.Y., 1963, pp. 325-332.

5. C. Lobry, "Contrôlabilité des Systèmes non Linéaires," SIAM J. on
Control, 8 (1970), pp. 573-605.

6. G.W. Haynes and H. Hermes, "Nonlinear Controllability via Lie
Theory," SIAM J. on Control, 8 (1970), pp. 450-460.

7. J. Kucera, "Solution in Large of Control Problem: $\dot{x}$ = (A(1-u)+Bu)x,"
Czech. Math. J., 16 (1966), no. 91, pp. 600-623.

8. J. Kucera, "Solution in Large of Control Problem: $\dot{x}$ = (Au+Bv)x,"
Czech. Math. J., 17 (1967), no. 92, pp. 91-96.

9. J. Kucera, "On Accessibility of Bilinear Systems," Czech. Math. J.,
20 (1970), no. 95, pp. 160-168.

10. V. Jurdjevic and H.J. Sussmann, "Control Systems on Lie Groups,"
J. Differential Equations, Vol. 12, No. 2 (1972), pp. 313-329.

11. H. Samelson, "Topology of Lie Groups," Bull. American Math. Soc.,
Vol. 58 (1952), pp. 2-37.

12. H. Samelson, Notes on Lie Algebras, Van Nostrand Reinhold Co., 1969.

13. R.W. Brockett, "On the Algebraic Structure of Bilinear Systems,"
in Theory and Applications of Variable Structure Systems, (A. Ruberti and
R. Mohler, eds.), Academic Press, N.Y., 1972.

14. L.S. Pontryagin, V. Boltyanskii, R. Gamkrelidze, and E. Mishchenko,
The Mathematical Theory of Optimal Processes, Interscience Publishers,
Inc., N.Y., 1962.

15. L. Cesari, "Existence Theorems for Optimal Solutions in Lagrange
and Pontryagin Problems," SIAM J. Control, 3 (1965), pp. 475-498.

16. K. Ito, "Stochastic Differential Equations on a Differentiable Manifold,"
Nagoya Math. J., 1 (1950), pp. 35-47.

17. R.W. Brockett and J.C. Willems, "Average Value Criteria for Stochastic
Stability," Symposium on Differential Equations and Dynamical Systems,
Springer Verlag Lecture Notes in Mathematics, Vol. 206, 1972.



DISTRIBUTION LIST 


NASA NGR 22-007-172 


NASA Lewis Research Center 
Project Manager 
21000 Brookpark Road 
Cleveland, OH 44135 

Attn: 0170/V. R. Lalli, M. S. 500-211 (3)


NASA Ames Research Center 
Ames Research Center 
Moffett Field, CA 94035 
Attn: Library


NASA Lewis Research Center 
Procurement Manager 
21000 Brookpark Road 
Cleveland, OH 44135 

Attn: 1400/F. H. Stickney, M. S. 500-302 


NASA Lewis Research Center 
Patent Counsel 
21000 Brookpark Road 
Cleveland, OH 44135 

Attn: 1004/N. T. Musial, M. S. 500-311 


NASA Scientific and Technical 
Information Facility 
NASA Headquarters 
Box 5700 
Bethesda, MD 

Attn: NASA Representative (3) 


NASA Lewis Research Center
Lewis Library 
21000 Brookpark Road 
Cleveland, OH 44135 
Attn: Library, M. S. 60-3 (2) 


NASA Lewis Research Center 
Lewis Management Services Div. 

21000 Brookpark Road 
Cleveland, OH 44135 

Attn: Report Control Office, M. S. 5-5 


U. S. Atomic Energy Commission 
Technical Reports Library 
Washington, DC 20545 


U. S. Atomic Energy Commission 
Technical Information Service Extension 
P. O. Box 62

Oak Ridge, TN 37830 (3) 


NASA Manned Spacecraft Center 
Houston, TX 77001 
Attn: Library


NASA Marshall Space Flight Center 
Huntsville, AL 35812 
Attn: Library 


National Aeronautics and Space 
Administration 

NASA Headquarters Program Office 
Washington, D. C. 20546 
Attn: PY/F.D. Hansing 

Forward to 

RPM/P. T. Maxwell (2) 


NASA Lewis Research Center 
Lewis Research Center Staff Members 
21000 Brookpark Road 
Cleveland, OH 44135 

Attn: 5224/C. S. Corcoran, M. S. 500-202 

5220/D. R. Packe, M. S. 500-201 (2) 


NASA Flight Research Center 
Flight Research Center 
P. O. Box 273 
Edwards, CA 93523 
Attn: Library 


NASA Goddard Space Flight Center 
Goddard Space Flight Center 
Greenbelt, MD 20771 
Attn: Library 


Jet Propulsion Laboratory 
4800 Oak Grove Dr.
Pasadena, CA 91103 
Attn: Library 


NASA Langley Research Center 
Langley Station 
Hampton, VA 23365 
Attn: Library 


NASA Western Operations 
150 Pico Blvd. 

Santa Monica, CA 90406 
Attn: Library 


U. S. Department of Transportation 
Transportation Systems Center 
Cambridge, MA 02142 
Attn: F. L. Raposa 


University City Science Institute 
Power Information Center, RM 2107 
3401 Market Street 
Philadelphia, PA 19104 (2) 


Duke University 

College of Engineering 

Dept. of Electrical Engineering

Durham, NC 27706 

Attn: Professor T. G. Wilson


U. S. Air Force Aeropropulsion Lab. 
Wright Patterson AFB 
Dayton, OH 45433 
Attn: Robert Johnson 


U. S. Army R and D Laboratory 

Ft. Monmouth, NJ 07703 

Attn: Frank Wrublewski AMSEL-KL-PE 


TRW Systems, Inc. 

Attn: A. D. Schoenfeld, M. S. R6, Rm. 2591 
One Space Park 
Redondo Beach, CA 90278 


California Institute of Technology 
Attn: Prof. R. D. Middlebrook

Electrical Engineering Dept. 
Pasadena, CA 91109 (2) 


NASA Lewis Research Center 
Lewis Office of Reliability and 
Quality Control 
21000 Brookpark Road 
Cleveland, OH 44135 
Attn: 0170/W. F. Dankhoff 500-211 



